Silicon Valley's SeeFood app using Swift's Core ML

Have you heard of the SeeFood app introduced in the series Silicon Valley? If you're basically living under a rock and have not seen this episode, check it out here! The app was supposed to determine what type of food is shown in a picture just taken. However, what Jian Yang, the dev character in the series, actually made fell a little short of that: it could only determine whether the food was a hotdog or not.

In this tutorial, we'll pretty much do the same, while exploring an important emerging trend: machine learning.

To give you a quick background, machine learning is a field of study that enables computers to learn without being explicitly programmed. In Swift, we can integrate machine learning into our app using Core ML.

However, there are some limitations when using Core ML. One is that it can't take in data and train models; it only runs pre-trained models, and it only supports regression and classification. But, in its defense, Core ML is still fairly new and hence has a huge room for improvement.

Oh, and to make things easier for you guys, I've provided a repository of the project, so you can check the whole code base while following the instructions below. You're welcome.

Setup

To begin, we must first create a Single View App, which, in this case, we will name SeeFood. Be sure to select Swift as its language. We can uncheck Core Data, Unit Tests, and UI Tests, as we won't be using them.

Next, download the Inception V3 Core ML model and drag it into our Xcode project. To allow the use of the camera, go to your Info.plist and add an item under the Information Property List. The key should be Privacy - Camera Usage Description (the raw key is NSCameraUsageDescription), and we can set the value to We need to access your camera.

Once this step is done, your project is ready for layout.

Layout

In your Main.storyboard, embed the view controller in a Navigation Controller (go to Editor > Embed In > Navigation Controller). Once that's set, add the following:

  • a UIBarButtonItem with its System Item set to Camera
  • a UIImageView with its required constraints set.

Also, set the image view's Content Mode to Aspect Fit to prevent the image from being distorted. You may customize the layout to your preference, but those are the fundamental elements that should be present in your app.

Enable Camera

Okay, so now your layout is as pretty as a unicorn on a rainbow (or maybe not). The next step is to write the code that enables our camera.

First, let's create an outlet connection for the UIImageView and name it imageView. In addition, we also need to create an action connection for the UIBarButtonItem and name it cameraTapped. In order to utilize the camera, make the view controller conform to UIImagePickerControllerDelegate and UINavigationControllerDelegate.
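
If you're following along, the top of the view controller should now look roughly like this (a minimal sketch; ViewController is the template's default class name):

import UIKit

class ViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {

    // Outlet connected to the UIImageView in the storyboard.
    @IBOutlet weak var imageView: UIImageView!

}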

Next, let's create an image picker object by adding:

let imagePicker = UIImagePickerController()

After that, we need to set the following (the assembled viewDidLoad() is shown right after this list):

  • the delegate of our image picker, inside the viewDidLoad() function:

imagePicker.delegate = self

  • the source type of the image picker so it uses the camera, and disable editing of the image after it is picked:

imagePicker.sourceType = .camera
imagePicker.allowsEditing = false
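
Put together, a minimal viewDidLoad() using only the settings above would look like this:

override func viewDidLoad() {
    super.viewDidLoad()

    // Receive the picked image via the delegate callbacks.
    imagePicker.delegate = self
    // Take photos with the camera and use them as-is, with no editing step.
    imagePicker.sourceType = .camera
    imagePicker.allowsEditing = false
}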

Be sure to present the image picker once the camera button is tapped, with the following block of code:

@IBAction func cameraTapped(_ sender: UIBarButtonItem) {
    present(imagePicker, animated: true, completion: nil)
}

Once an image has been picked, the imagePickerController(_:didFinishPickingMediaWithInfo:) delegate method is triggered. When it fires, we should update our image view with the picked image and then dismiss the image picker.

func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
    if let pickedImage = info[UIImagePickerControllerOriginalImage] as? UIImage {
        imageView.image = pickedImage
    }

    imagePicker.dismiss(animated: true, completion: nil)
}

Image Recognition

After you have tested and made sure that your camera works, let's now proceed to the most crucial segment of this app: Image Recognition!

For us to do that, we need to import the Vision and Core ML frameworks.
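
At the top of the view controller, that means adding:

import CoreML
import Vision

Then, inside our imagePickerController(_:didFinishPickingMediaWithInfo:) delegate method, right after we set the image in our image view, let's convert the picked image to a CIImage so Vision and Core ML can get an interpretation from it: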

guard let ciImage = CIImage(image: pickedImage) else {
    fatalError("Could not convert UIImage to CIImage.")
}

Create a method named detect that accepts a CIImage argument. This method will contain the functionality for determining whether the image is a hotdog or not a hotdog.

func detect(image: CIImage) {}

Inside our detect method, let's first load the Inception V3 model as a Vision-compatible Core ML model.

guard let model = try? VNCoreMLModel(for: Inceptionv3().model) else {
    fatalError("Loading CoreML model failed.")
}

After loading the model, we create a request that will classify whether the image is a hotdog or not a hotdog. VNCoreMLRequest takes a completion handler that hands back the finished VNRequest and an optional Error. The request's results are an array of VNClassificationObservation objects containing all the possible classifications along with their confidence levels, sorted so that the first one has the highest confidence. We will use that first result to check whether the image taken is a hotdog or not a hotdog, and update the UI based on the result.

let request = VNCoreMLRequest(model: model) { (request, error) in
    guard let results = request.results as? [VNClassificationObservation] else {
        fatalError("Model failed to process image.")
    }

    if let firstResult = results.first {
        guard let navBar = self.navigationController?.navigationBar else { fatalError() }

        if firstResult.identifier.contains("hotdog") {
            self.navigationItem.title = "Hotdog!"
            navBar.barTintColor = UIColor.green
        } else {
            self.navigationItem.title = "Not Hotdog!"
            navBar.barTintColor = UIColor.red
        }
    }
}

Then we perform the classification of the image.

let handler = VNImageRequestHandler(ciImage: image)
do {
    try handler.perform([request])
} catch {
    print(error)
}
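
Here is the whole detect(image:) method in one place, assembled from the snippets above:

func detect(image: CIImage) {
    // Load the Inception V3 model wrapped for use with Vision.
    guard let model = try? VNCoreMLModel(for: Inceptionv3().model) else {
        fatalError("Loading CoreML model failed.")
    }

    // Ask Vision to classify the image; results come back sorted by confidence.
    let request = VNCoreMLRequest(model: model) { (request, error) in
        guard let results = request.results as? [VNClassificationObservation] else {
            fatalError("Model failed to process image.")
        }

        if let firstResult = results.first {
            guard let navBar = self.navigationController?.navigationBar else { fatalError() }

            // Update the title and bar color based on the top classification.
            if firstResult.identifier.contains("hotdog") {
                self.navigationItem.title = "Hotdog!"
                navBar.barTintColor = UIColor.green
            } else {
                self.navigationItem.title = "Not Hotdog!"
                navBar.barTintColor = UIColor.red
            }
        }
    }

    // Perform the classification request on the image.
    let handler = VNImageRequestHandler(ciImage: image)
    do {
        try handler.perform([request])
    } catch {
        print(error)
    }
}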

Now we call our detect(image:) method after converting our image to a CIImage.

detect(image: ciImage)
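
For reference, the finished delegate method, with the conversion and the detect(image:) call in place, reads:

func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
    if let pickedImage = info[UIImagePickerControllerOriginalImage] as? UIImage {
        imageView.image = pickedImage

        // Convert to CIImage so Vision and Core ML can classify the photo.
        guard let ciImage = CIImage(image: pickedImage) else {
            fatalError("Could not convert UIImage to CIImage.")
        }
        detect(image: ciImage)
    }

    imagePicker.dismiss(animated: true, completion: nil)
}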

Try it out! Ideally, it should label a hotdog as Hotdog! and anything else as Not Hotdog!

Now go have fun looking for hotdogs!
