ML.NET Tutorial 3 – Sentiment Analysis Using TensorFlow

We will use a TensorFlow model in this tutorial. You don’t have to worry if you don’t know TensorFlow, because we will simply import a pre-trained TensorFlow model and use it in ML.NET.

In case you want to know, TensorFlow is an open-source library developed by Google for machine learning and data science.

TensorFlow Tutorials here 

 

Prerequisite

Visual Studio 2017

We will cover the following topics:

  1. Obtain the Model and Dataset
  2. Set up the Project in Visual Studio
  3. Add Global Variables
  4. How the Model Works
  5. Create the Classes
  6. Create the ML Context, Lookup Dictionary and Resize action
  7. Load the TensorFlow Model
  8. Create a Learning Pipeline
  9. Make Prediction and Display the Output

 

 

1. Obtain the Model and Dataset

The first step is to download the model and the dataset. They are available here.

Unzip the file into a local directory. The folder contains two files and one folder:

imdb_word_index.csv: this file maps each word to an integer value.

saved_model.pb: this is the TensorFlow model. pb stands for protobuf. The file contains the graph definition of the model. The model takes a fixed-length array of 600 integers as input, which represents the review text after each word has been mapped to an integer. It outputs two numbers (probabilities): one for negative sentiment and one for positive sentiment.

variables: this is a folder containing two files used by the model. In the TensorFlow SavedModel format, this folder normally holds the trained variable values (weights).

 

2. Set up the Project in Visual Studio

Open Visual Studio

Create a project and name it SentimentAnalysisTensorFlow (you can use another name but it’s better to stick with this for now)

Add the following packages using the NuGet Package Manager:

  • Microsoft.ML
  • Microsoft.ML.TensorFlow

Depending on the package version you install, you may also need the SciSharp.TensorFlow.Redist package, which supplies the native TensorFlow library. Add it if loading the model fails with a missing-library error.

Create a folder in the project and name it Data.

Copy the contents of the sentiment_model folder into the Data folder.

Then set the ‘Copy to Output Directory’ property of each copied file to ‘Copy if newer’.

 

3. Add the Global Variables

Since the model expects an array of length 600, define an integer constant named FeatureLength and set it to 600.

Also define the path to the folder that holds the model files.

The code to do this is shown below. Place it inside the Program class, before the Main method.

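// Requires "using System;" and "using System.IO;" at the top of the file
public const int FeatureLength = 600;
// The model files were copied into the project's Data folder in step 2
static readonly string _modelPath = Path.Combine(Environment.CurrentDirectory, "Data");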

 

4. How the Model Works

The movie review is a sentence written by a user.

So the first step is to split the sentence into an array of words.

Next, we use the mapping file to map the array of words to a variable-length array of integers.

Then, we resize this array to a fixed-size array of 600 elements.

Finally, we feed this array into the model and obtain the prediction output.
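
To make these steps concrete, here is a minimal sketch of the same preprocessing in plain C#, without ML.NET. The class name PreprocessingSketch, the three-word vocabulary and its ids are made up for illustration (the real mapping comes from imdb_word_index.csv), and sending unknown words to 0 is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PreprocessingSketch
{
    // Same fixed input length the tutorial's model expects.
    public const int FeatureLength = 600;

    // Hypothetical vocabulary; the ids are invented for this example.
    static readonly Dictionary<string, int> WordIndex = new Dictionary<string, int>
    {
        { "this", 11 }, { "is", 6 }, { "interesting", 218 }
    };

    public static int[] Encode(string review)
    {
        // Step 1: split the sentence into words on spaces.
        string[] words = review.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Step 2: map each word to its integer id (unknown words become 0 here).
        int[] ids = words
            .Select(w => WordIndex.TryGetValue(w, out int id) ? id : 0)
            .ToArray();

        // Step 3: resize to the fixed length (pads with zeros or truncates).
        Array.Resize(ref ids, FeatureLength);
        return ids;
    }

    static void Main()
    {
        int[] features = Encode("this is interesting");
        Console.WriteLine(features.Length);                    // 600
        Console.WriteLine(string.Join(",", features.Take(4))); // 11,6,218,0
    }
}
```

In the tutorial itself, ML.NET transforms do this work inside the pipeline, so this helper is only for understanding the data flow.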

 

5. Create the Classes

Based on the above, we need to create four classes:

  • a class to hold the input sentence (review) – MovieReview
  • a class to hold the variable-length array of integers – VariableLength
  • a class to hold the fixed-length array of integers – FixedLength
  • a class to hold the prediction (output) – SentimentPrediction

The intermediate array of words does not need a class of its own. It is held in a pipeline column (TokenizedWords), as you will see in step 8.

These classes are given below:

// This class holds the original sentiment data
class MovieReview
{
    public string ReviewText { get; set; }
}

 

Next class

// This class holds the variable-length feature,
// i.e. the review text mapped to an array of integers
class VariableLength
{
    [VectorType]
    public int[] VariableLengthFeatures { get; set; }
}

 

Next class

// This class holds the fixed-length feature array
class FixedLength
{
    // FeatureLength is a const, so it is implicitly static
    // and can be referenced as Program.FeatureLength
    [VectorType(Program.FeatureLength)]
    public int[] Features { get; set; }
}

 

Next class:

// Prediction output by the model
class SentimentPrediction
{
    [VectorType(2)]
    public float[] Prediction { get; set; }
}

 

6. Create the MLContext, the lookup Dictionary and Resize action

We will now create the MLContext in the Main method. The MLContext is the environment, or starting point, for all ML.NET operations. Create one using the line below:

MLContext mlContext = new MLContext();

 

The next piece of code loads the lookup map from the .csv file. The pipeline will use it as a dictionary to map each word to its integer value.

// We now create a dictionary and load the mapping data
var lookupMap = mlContext.Data.LoadFromTextFile(Path.Combine(_modelPath, "imdb_word_index.csv"),
    columns: new[]
    {
        new TextLoader.Column("Words", DataKind.String, 0),
        new TextLoader.Column("Ids", DataKind.Int32, 1),
    },
    separatorChar: ','
    );
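
If it helps to picture what the lookup data contains, the following plain C# sketch parses rows of the form word,id into a dictionary. The class name WordIndexSketch and the sample rows are hypothetical and not copied from the real file; the tutorial itself keeps the data in an IDataView rather than a Dictionary.

```csharp
using System;
using System.Collections.Generic;

static class WordIndexSketch
{
    // Parses "word,id" lines into a dictionary; malformed lines are skipped.
    public static Dictionary<string, int> Parse(IEnumerable<string> lines)
    {
        var map = new Dictionary<string, int>();
        foreach (string line in lines)
        {
            // Split on the last comma so the id is always the final field.
            int comma = line.LastIndexOf(',');
            if (comma <= 0) continue;

            string word = line.Substring(0, comma);
            if (int.TryParse(line.Substring(comma + 1), out int id))
                map[word] = id;
        }
        return map;
    }

    static void Main()
    {
        // Made-up sample rows for illustration only.
        var map = Parse(new[] { "movie,17", "great,84", "broken line" });
        Console.WriteLine(map.Count);    // 2
        Console.WriteLine(map["great"]); // 84
    }
}
```

To try it against the real file, you could pass File.ReadLines(path) from System.IO as the lines argument.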

 

We now write the code to resize the variable length array to fixed length array:

// Resize the variable-length integer array to a fixed length of 600
Action<VariableLength, FixedLength> ResizeFeaturesAction = (s, f) =>
{
    var features = s.VariableLengthFeatures;
    Array.Resize(ref features, FeatureLength);
    f.Features = features;
};
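
The resize step relies on how Array.Resize behaves: a shorter array is padded with zeros, and a longer one is truncated. The small sketch below shows both cases with a target length of 4 instead of 600 so the output is easy to read. The class and method names are hypothetical.

```csharp
using System;

static class ResizeSketch
{
    // Returns a copy of 'source' padded with zeros or truncated to 'length'.
    public static int[] ToFixedLength(int[] source, int length)
    {
        // Clone first: Array.Resize leaves the array untouched when the
        // length already matches, and we never want to alias the caller's array.
        int[] copy = (int[])source.Clone();
        Array.Resize(ref copy, length);
        return copy;
    }

    static void Main()
    {
        int[] shortReview = { 5, 9 };
        int[] longReview = { 1, 2, 3, 4, 5, 6 };

        Console.WriteLine(string.Join(",", ToFixedLength(shortReview, 4))); // 5,9,0,0
        Console.WriteLine(string.Join(",", ToFixedLength(longReview, 4)));  // 1,2,3,4
    }
}
```

One consequence worth knowing is that any words beyond the 600th in a long review are dropped before the model sees them.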

 

7. Load the TensorFlow Model

We will now load the TensorFlow model using the LoadTensorFlowModel() method of mlContext.Model.

// Next we load the pre-trained TensorFlow model
// Before you can do this, you need to add the Microsoft.ML.TensorFlow package using the NuGet Package Manager
TensorFlowModel tensorFlowModel = mlContext.Model.LoadTensorFlowModel(_modelPath);

 

Let’s display some information about the model we loaded.

// Next we extract the input and output schema from the TensorFlow model and display them to the output
/*The input schema is the fixed-length array of integer encoded words. 
    * The output schema is a float array of probabilities 
    * indicating whether a review's sentiment is negative, or positive 
    */
DataViewSchema schema = tensorFlowModel.GetModelSchema();

Console.WriteLine(" =============== TensorFlow Model Schema =============== ");

var featuresType = (VectorDataViewType)schema["Features"].Type;
Console.WriteLine($"Name: Features, Type: {featuresType.ItemType.RawType}, Size: ({featuresType.Dimensions[0]})");

var predictionType = (VectorDataViewType)schema["Prediction/Softmax"].Type;

Console.WriteLine($"Name: Prediction/Softmax, Type: {predictionType.ItemType.RawType}, Size: ({predictionType.Dimensions[0]})");
Console.WriteLine("Press Enter to continue...");
Console.ReadLine();

 

8. Create a Learning Pipeline

We build the pipeline below. I have added comments to explain the different parts of the code.

// We now create a learning pipeline (you already know this)
IEstimator<ITransformer> pipeline =
    // Split the text into words using spaces
    mlContext.Transforms.Text.TokenizeIntoWords("TokenizedWords", "ReviewText")

    // Map the words to their corresponding integer values
    .Append(mlContext.Transforms.Conversion.MapValue("VariableLengthFeatures",
        lookupMap, lookupMap.Schema["Words"], lookupMap.Schema["Ids"], "TokenizedWords"))

    // Resize the variable-length encoding to the fixed length
    .Append(mlContext.Transforms.CustomMapping(ResizeFeaturesAction, "Resize"))

    // Classify the input with the TensorFlow model
    .Append(tensorFlowModel.ScoreTensorFlowModel("Prediction/Softmax", "Features"))

    // Copy the model output into a new column named Prediction
    .Append(mlContext.Transforms.CopyColumns("Prediction", "Prediction/Softmax"));

 

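The next step creates an ML.NET model from the pipeline. It is also explained in the comments.

// We now create a new ML.NET model from the pipeline
// (remember the original model came from TensorFlow).
// An empty data view is enough here: the TensorFlow model is already trained,
// so Fit only needs the input schema.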
IDataView dataView = mlContext.Data.LoadFromEnumerable(new List<MovieReview>());
ITransformer model = pipeline.Fit(dataView);

// Call the PredictSentiment Method to make a prediction 
PredictSentiment(mlContext, model);

 

9. Make Prediction and Display the Output

Finally, we make a prediction using the model and display the result in the console.

//We are almost there. 
//We have to use the model to make prediction. We create a method
public static void PredictSentiment(MLContext mlContext, ITransformer model)
{
    //Create a prediction engine
    var engine = mlContext.Model.CreatePredictionEngine<MovieReview, SentimentPrediction>(model);

    var review = new MovieReview()
    {
        ReviewText = "This is an interesting movie!"
    };

    var sentimentPrediction = engine.Predict(review);

    Console.WriteLine("Number of classes: {0}", sentimentPrediction.Prediction.Length);
    Console.WriteLine("Is sentiment/review positive? {0}", sentimentPrediction.Prediction[1] > 0.5 ? "Yes." : "No.");
    Console.WriteLine("Press Enter to exit...");
    Console.ReadLine();
}
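
As a small aside on reading the output, the sketch below turns a two-element probability array into a label. It assumes index 0 is the negative class and index 1 is the positive class, which matches the Prediction[1] > 0.5 check above. The class name OutputSketch and the sample values are made up.

```csharp
using System;

static class OutputSketch
{
    // Assumes prediction[0] = P(negative) and prediction[1] = P(positive).
    public static string Label(float[] prediction)
    {
        if (prediction == null || prediction.Length != 2)
            throw new ArgumentException("Expected exactly two probabilities.", nameof(prediction));

        return prediction[1] > 0.5f ? "Positive" : "Negative";
    }

    static void Main()
    {
        // Invented probabilities, not real model output.
        Console.WriteLine(Label(new[] { 0.07f, 0.93f })); // Positive
        Console.WriteLine(Label(new[] { 0.80f, 0.20f })); // Negative
    }
}
```

Validating the array length first gives a clearer error than an index-out-of-range exception if the model output ever has a different shape.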

We are done!

Just run the program and you will get the output shown below. If it ran successfully, then congrats!

Sentiment Analysis with TensorFlow Output
