
Bringing GPT-2 to Android with KerasNLP: an ODML guide


Android developers and AI enthusiasts are exploring the prospect of running powerful language models like GPT-2 directly on Android devices. The KerasNLP workshop from Google I/O 2023 offers the insights needed to make it happen. Here's a detailed guide to integrating GPT-2 as an On-Device Machine Learning (ODML) model on Android using KerasNLP.

Why use ODML on Android?

On-device machine learning offers several benefits:

  1. Latency: No need to wait for server responses.
  2. Privacy: Data stays on the device.
  3. Offline Access: Works without internet connectivity.
  4. Reduced Costs: Lower server and bandwidth costs.


Setting up the environment:

Start with a working setup on your development machine: make sure Python is installed along with TensorFlow and KerasNLP. Install KerasNLP using:

pip install keras-nlp

Loading and Preparing GPT-2 with KerasNLP

KerasNLP simplifies the process of loading pre-trained models. Load GPT-2 and prepare it for on-device use:

from keras_nlp.models import GPT2CausalLM

# Load the pre-trained GPT-2 model (downloads weights on first use)
model = GPT2CausalLM.from_preset("gpt2_base_en")

Fine-tuning GPT-2:

To make the model more relevant to your Android application, fine-tune it on a domain-specific dataset.

# Example of fine-tuning the model on a prepared dataset
model.fit(dataset, epochs=3)
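The dataset passed to model.fit can be a simple batched tf.data pipeline of raw strings, since GPT2CausalLM bundles its own preprocessor. A minimal sketch (the example texts and batch size are illustrative):

```python
import tensorflow as tf

# Illustrative raw-text corpus; in practice, load your own domain data.
texts = [
    "On-device ML keeps user data private.",
    "TensorFlow Lite targets mobile and embedded devices.",
]

# Batch the strings; GPT2CausalLM's built-in preprocessor tokenizes them
# during model.fit, so no manual tokenization is needed here.
dataset = tf.data.Dataset.from_tensor_slices(texts).batch(2)
```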

Converting the model for Android:

Once the model is fine-tuned, the next step is to convert it into a TensorFlow Lite (TFLite) format, which is optimized for mobile devices.

import tensorflow as tf

# Convert the fine-tuned Keras model to TFLite
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model to a file
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
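For a model as large as GPT-2, size matters on mobile, and the converter's dynamic-range quantization option can shrink the weights substantially. A minimal sketch, using a tiny stand-in Keras model since converting the full GPT-2 is slow (the layer shapes are illustrative; in the guide above, the converter input would be the fine-tuned model):

```python
import tensorflow as tf

# Tiny stand-in model; in practice this would be the fine-tuned GPT-2.
stand_in = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(4),
])

converter = tf.lite.TFLiteConverter.from_keras_model(stand_in)
# Dynamic-range quantization stores weights in 8-bit, cutting model size.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_model = converter.convert()
```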

Integrating the TFLite model in Android:

Step 1: Add TensorFlow Lite dependency
Add the TensorFlow Lite library to your build.gradle file.
implementation 'org.tensorflow:tensorflow-lite:2.7.0'

Step 2: Load the model in the Android app

Place the model.tflite file in the assets directory and write code to load and run the model using Kotlin.

suspend fun initModel() {
    withContext(dispatcher) {
        val loadResult = loadModelFile(context) // Load the model file

        // Check whether loading failed and handle the cause
        if (loadResult.isFailure) {
            when (loadResult.exceptionOrNull()) {
                is FileNotFoundException -> {
                    // Handle FileNotFoundException (e.g. missing asset)
                }
                else -> {
                    // Handle any other exception
                }
            }
            return@withContext
        }

        // Initialize the interpreter with the loaded model
        val model = loadResult.getOrNull()
        isInitialized = model?.let {
            interpreter = Interpreter(it)
            true
        } ?: false
    }
}

Running inference:

Prepare your input data and call the runInterpreter method to get predictions.

@WorkerThread
private fun runInterpreter(input: String): String {
    val outputBuffer = ByteBuffer.allocateDirect(OUTPUT_BUFFER_SIZE)

    // Run the interpreter, which generates text into outputBuffer
    interpreter.run(input, outputBuffer)

    // Set the buffer limit to the current position and reset position to 0
    outputBuffer.flip()

    // Copy the bytes out of the output buffer
    val bytes = ByteArray(outputBuffer.remaining())
    outputBuffer.get(bytes)
    outputBuffer.clear()

    // Return the bytes decoded as a UTF-8 string
    return String(bytes, Charsets.UTF_8)
}
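The decode step above can be mirrored in plain Python to test the buffer-handling logic off-device. A minimal sketch (`decode_output` is a hypothetical helper, not part of any TFLite API; the NUL-stripping covers the common case where a fixed-size buffer is zero-padded past the generated text):

```python
def decode_output(raw: bytes) -> str:
    # A fixed-size output buffer may be zero-padded past the generated
    # text; strip trailing NULs before decoding as UTF-8.
    return raw.rstrip(b"\x00").decode("utf-8")
```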

Final thoughts 

Integrating ODML with KerasNLP and TensorFlow Lite can transform an Android device into a powerhouse for real-time NLP tasks. Whether it's for chatbots, language translation, or content generation, the capabilities are now in the palm of your hand.
