Android developers and AI enthusiasts are exploring the prospect of running powerful language models like GPT-2 directly on Android devices. The KerasNLP workshop from Google I/O 2023 offers the insights needed to make it happen. Here's a detailed guide to integrating GPT-2 as an on-device machine learning (ODML) model on Android using KerasNLP.
Why use ODML on Android?
On-device machine learning offers several benefits:
- Latency: No need to wait for server responses.
- Privacy: Data stays on the device.
- Offline Access: Works without internet connectivity.
- Reduced Costs: Lower server and bandwidth costs.
Setting up the environment:
Start with a solid setup on your development machine: make sure Python is installed along with TensorFlow and KerasNLP. Install KerasNLP using:
pip install keras-nlp
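To confirm the environment is ready, a quick import check helps (a minimal sketch; the printed versions will vary with your install):
import tensorflow as tf
import keras_nlp

# Print versions to confirm both libraries are importable
print('TensorFlow:', tf.__version__)
print('KerasNLP:', keras_nlp.__version__)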
Loading and preparing GPT-2 with KerasNLP:
KerasNLP simplifies the process of loading pre-trained models. Load GPT-2 and prepare it for on-device use:
import keras_nlp

# Load the pre-trained GPT-2 causal language model from its KerasNLP preset
model = keras_nlp.models.GPT2CausalLM.from_preset('gpt2_base_en')
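Before going further, it is worth smoke-testing the model with a short generation call. A minimal sketch (the prompt and max_length are illustrative):
# Generate a continuation of the prompt; max_length counts tokens, prompt included
output = model.generate('My favorite hike in Yosemite is', max_length=100)
print(output)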
Fine-tuning GPT-2:
To make the model more relevant to your Android application, fine-tune it on a domain-specific dataset, as sketched below.
# Example of fine-tuning the model
model.fit(dataset, epochs=3)
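Here dataset is assumed to be a tf.data.Dataset of raw strings; GPT2CausalLM bundles its own preprocessor, so no manual tokenization is needed. A fuller sketch, with texts standing in for your own corpus and illustrative hyperparameters:
import tensorflow as tf

# Stand-in corpus; replace with text samples from your domain
texts = ['First example from your domain.', 'Second example from your domain.']
dataset = tf.data.Dataset.from_tensor_slices(texts).batch(2)

# Compile with an explicit optimizer and loss before fine-tuning
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    weighted_metrics=['accuracy'],
)
model.fit(dataset, epochs=3)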
Converting the model for Android:
Once the model is fine-tuned, the next step is to convert it into a TensorFlow Lite (TFLite) format, which is optimized for mobile devices.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model to a file
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
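Note that converting a large generative model can require extra converter settings in practice (for example, allowing select TF ops via converter.target_spec.supported_ops); the snippet above shows the simplest path. Either way, it is worth loading the resulting file back to confirm it parses before bundling it into the app:
import tensorflow as tf

# Load the converted model and inspect its input/output signature
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()
print(interpreter.get_input_details())
print(interpreter.get_output_details())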
Integrating the TFLite model in Android:
Step 1: Add TensorFlow Lite dependency
Add the TensorFlow Lite library to your build.gradle file.
implementation 'org.tensorflow:tensorflow-lite:2.7.0'
Step 2: Load the model in the Android app
Place the model.tflite file in the assets directory and write code to load and run the model using Kotlin.
suspend fun initModel() {
    withContext(dispatcher) {
        // Load the model file; loadModelFile is assumed to return a
        // Result wrapping the model buffer read from assets
        val loadResult = loadModelFile(context)

        // Check if loading was successful
        if (loadResult.isFailure) {
            when (loadResult.exceptionOrNull()) {
                is FileNotFoundException -> {
                    // Handle FileNotFoundException, e.g. log and surface an error
                }
                else -> {
                    // Handle any other exception
                }
            }
            return@withContext
        }

        // Initialize the interpreter with the loaded model
        val model = loadResult.getOrNull()
        isInitialized = model?.let {
            interpreter = Interpreter(it)
            true
        } ?: false
    }
}
Running inference:
Prepare your input data and call the runInterpreter method to get predictions.
@WorkerThread
private fun runInterpreter(input: String): String {
    val outputBuffer = ByteBuffer.allocateDirect(OUTPUT_BUFFER_SIZE)

    // Run the interpreter, which generates text into outputBuffer
    interpreter.run(input, outputBuffer)

    // Set the buffer limit to the current position and rewind to position 0
    outputBuffer.flip()

    // Copy the bytes out of the output buffer
    val bytes = ByteArray(outputBuffer.remaining())
    outputBuffer.get(bytes)
    outputBuffer.clear()

    // Return the bytes decoded as a UTF-8 string
    return String(bytes, Charsets.UTF_8)
}
Final thoughts
Integrating ODML with KerasNLP and TensorFlow Lite can transform an Android device into a powerhouse for real-time NLP tasks. Whether it's for chatbots, language translation, or content generation, the capabilities are now in the palm of your hand.