Create high-quality instructions for Amazon SageMaker Ground Truth labeling jobs

Amazon SageMaker Ground Truth helps you quickly build highly accurate training datasets for machine learning (ML). You can use your own workers, a choice of vendor-managed workforces that specialize in data labeling, or a public workforce powered by Amazon Mechanical Turk to provide the human-generated labels. To get high-quality labels, you must provide simple, concise, and clear instructions, especially when using a public workforce. Writing good instructions is the single most important action you can take to improve annotation quality. It’s worth investing the time to do it right.

This blog post shares best practices for creating highly effective instructions for a public workforce. There are two key points: reduce the cognitive load for the workers as much as possible, and experiment early in the process to fine-tune your instructions and save yourself trouble later on. You can experiment by labeling some of your data yourself and by submitting small jobs to the public workforce throughout the process.

The following screenshot shows an example of a Ground Truth bounding box labeling task with good instructions from the worker’s perspective. In this example task, we ask workers to draw boxes around flowers in images taken from the Google Open Images Dataset. The left side of the image shows the short instructions that are constantly visible in a sidebar while the worker is annotating. They are clear, to the point, specialized to the task, and focused on example images.

The following figure shows an example of the full instructions that a worker can see by choosing View full instructions in the sidebar. They clarify ambiguities that could confuse the worker. By the end of this post, you’ll be able to create high-quality instructions for your own labeling job.

Our recommended workflow

The quickest way to create good instructions is to use the tools provided by Ground Truth to annotate some of your own data. You can then use the results as examples in your instructions. To do this, take the following steps:

  1. Select a small number of examples from your data.
  2. Run a private job on Ground Truth to label your chosen examples.
  3. Create the short instructions using your results. Focus on example images and small amounts of text.
  4. Create the full instructions to clarify ambiguities in the task.
  5. Run a small public job to test the instructions. Iterate on the results until you are satisfied.
  6. Consider simplifying your task, and set a reasonable price.

Note: Running the private labeling jobs will cost $0.08 per example. For pricing details, see the Amazon SageMaker Ground Truth pricing page.

After you have produced high-quality instructions, you can send your full labeling job out to the public workforce. Let’s go over each step in the checklist.

Select a small number of examples from your data

Browse your dataset and select examples that capture the variety in your data. Choosing examples from the items you want to label (as opposed to generic examples) ensures the instructions will help annotators understand your specific task.

Here, we select images with different numbers of flowers of various shapes and sizes. The flowers in some of these images are hidden behind others or touch the edge of the frame. Choosing a variety of cases makes it easier to find good examples for creating the instructions. It also gives you insight into the difficulty of the task from the worker’s point of view.

Run a private job on Ground Truth to label your chosen examples

A previous blog post described how to run a labeling job using the AWS Management Console. Follow the method described there to label the examples you chose in the previous section. You need to add the images you have selected to a manifest file, create a private work team with your own email address, and select one annotator per example. There’s no reason for you to label the same example multiple times.
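As a rough illustration of the manifest step, the following sketch writes an input manifest in the JSON Lines form that Ground Truth reads, with one `source-ref` entry per image. The bucket name, object keys, and output file name are hypothetical placeholders, not values from this post.

```python
import json


def build_manifest_lines(s3_uris):
    """Return one JSON line per image, each with a "source-ref" key."""
    return [json.dumps({"source-ref": uri}) for uri in s3_uris]


def write_manifest(s3_uris, path):
    """Write the manifest to a local file, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as manifest_file:
        for line in build_manifest_lines(s3_uris):
            manifest_file.write(line + "\n")


# Hypothetical bucket and keys, standing in for the examples you selected.
example_uris = [
    "s3://my-example-bucket/flowers/img_001.jpg",
    "s3://my-example-bucket/flowers/img_002.jpg",
    "s3://my-example-bucket/flowers/img_003.jpg",
]

if __name__ == "__main__":
    write_manifest(example_uris, "private-job.manifest")
```

You would then upload the resulting file to Amazon S3 and point the private labeling job at it.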

Running this private job gives you perspective on what you want to accomplish with your labeling job, the difficulty of the task, and the tools the annotators will be using. Make a record of the examples that were difficult or ambiguous as you work. You will need these later to write the full instructions. In addition, you should consider timing yourself to gauge how much to pay the workers for your task.

The left figure shows a preview of the bounding box tool at work. Notice that the instructions on the left side of the image have not yet been created. The right figure demonstrates the results from the private labeling job.

Create the short instructions using your results

After you finish the private labeling job, you can find the results in the Amazon SageMaker console by going to Labeling jobs and selecting the name you gave the job. The annotated examples are at the bottom of the page. For image labeling tasks, the simplest way to extract the results is to zoom in on these annotated images and take screenshots.

Ideally, narrow your results to one or two exemplary “good” instances, then create one or two images with various bad annotations illustrating what you expect to be the most common sources of failure. You can do this by re-running the private labeling job and skipping all the other examples. Alternatively, you can combine examples of good and bad annotations in a single example image to help the workers quickly understand the task. One particularly inventive strategy is to use an animated GIF that alternates between good and bad examples. For the flower labeling instructions, we use the following images for the good and bad examples, respectively:

After you have selected the example instances and extracted the results, use your favorite image editing software (such as Google Drawings, GIMP, Keynote, or PowerPoint) to put the finishing touches on the figures for your instructions. For example, you might consider placing Xs over images representing incorrect annotations.

Upload your images to an Amazon S3 bucket

Upload the images to an Amazon S3 bucket and set the object permissions so that the images are publicly available. If your S3 bucket has the default permissions, you’ll have to first change the public access settings for the bucket to allow the images to be publicly available. We strongly recommend against making the entire bucket publicly accessible. To make it possible for the images to be public, go to the Amazon S3 console, select your bucket, and choose the Permissions tab. You should see something similar to the following image:

Choose Edit, then uncheck the first two boxes. Choose Save.

A confirmation dialog box appears. Type “confirm” in the appropriate field and choose Confirm to update the public access settings.

To finish uploading the image, return to your S3 bucket overview by choosing Overview. Then choose Upload, drag and drop the file into the dialog box, and then choose Upload in the dialog box. Finally, select the image name from the S3 bucket overview and choose Make public to make the image publicly accessible from the internet.

If your bucket permissions have been set correctly, a message saying Success appears.

Finally, we recommend returning to the bucket permissions tab and re-checking the first box, Block new public ACLs and uploading public objects. This prevents you from accidentally making a different object public in the future.
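If you prefer to script these console steps, a minimal sketch with the boto3 S3 client might look like the following. It rests on several assumptions: that the first two console check boxes correspond to the `BlockPublicAcls` and `IgnorePublicAcls` settings, that the two policy-related settings should stay enabled, and that ACLs are enabled on the bucket. The helper names, bucket name, and file names are hypothetical.

```python
def set_acl_blocks(s3_client, bucket, block_new_acls, ignore_acls):
    """Set the two ACL-related public access settings on a bucket.

    put_public_access_block overwrites the whole configuration, so the two
    policy-related settings are passed explicitly and (by assumption) kept on.
    """
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": block_new_acls,
            "IgnorePublicAcls": ignore_acls,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )


def publish_instruction_image(s3_client, bucket, local_path, key):
    """Upload one image with a public-read ACL, then block new public ACLs."""
    # Equivalent of unchecking the first two boxes in the console.
    set_acl_blocks(s3_client, bucket, block_new_acls=False, ignore_acls=False)
    # Only this object is made public, never the whole bucket.
    s3_client.upload_file(local_path, bucket, key, ExtraArgs={"ACL": "public-read"})
    # Equivalent of re-checking the first box; the existing ACL stays in effect.
    set_acl_blocks(s3_client, bucket, block_new_acls=True, ignore_acls=False)


# Example call (requires boto3 and AWS credentials):
#   import boto3
#   publish_instruction_image(
#       boto3.client("s3"),
#       "my-example-bucket",
#       "good_example.png",
#       "instructions/good_example.png",
#   )
```

Passing the client in as an argument keeps the sketch easy to try against a stub before you run it on a real bucket.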

Use the instruction-making tool to finish creating the instructions

Finally, go to the instruction-making tool in the job creation section of the Amazon SageMaker console, create your instructions, and link to the images you gathered in your S3 bucket. You can place your images in the short instructions by choosing the image icon in the instructions tool and entering the object URL, which you can find in the S3 bucket overview by selecting the image name.

After you have added the image, you’ll see a thumbnail in the instruction-making tool.

If you instead see a broken image link icon like the one on the right in the preceding figure, double-check that you have correctly set the bucket and object permissions by following the steps in the previous section.
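One way to check the permissions outside the console is to request the object URL anonymously. The sketch below assumes the virtual-hosted-style URL form `https://<bucket>.s3.<region>.amazonaws.com/<key>`. The bucket, Region, and key are hypothetical, and the URL shown in your S3 console is the authoritative one.

```python
import urllib.parse
import urllib.request


def object_url(bucket, region, key):
    """Build a virtual-hosted-style URL for an S3 object (assumed URL form)."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{urllib.parse.quote(key)}"


def is_publicly_readable(url, timeout=10):
    """Return True if an anonymous HEAD request for the URL succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors such as 403 Forbidden as well as network failures.
        return False


if __name__ == "__main__":
    url = object_url("my-example-bucket", "us-east-1", "instructions/good_example.png")
    print(url, is_publicly_readable(url))
```

A result of `False` for an object you expect to be public suggests revisiting the bucket and object permissions.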

Many workers will only read the short instructions, so make them count. Focus on your example images, with a small amount of explanatory text in simple English. Use short sentences. Remember, the annotators are not always fluent in English, and ambiguous instructions lead to ambiguous results. Your goal is to be as explicit as possible while keeping things simple.

Create the full instructions to clarify ambiguities in the task

After you have finished writing the short instructions, choose Additional instructions in the instruction-making tool to begin working on the full instructions. Here are some points to keep in mind:

  • The full instructions should clarify ambiguities in your task. Often, annotators will only consult these if they are confused. Use your experience from the private job to anticipate sources of confusion.
  • Try not to repeat the short instructions.
  • Catching every edge case at the expense of having pages and pages of instructions is usually a mistake. In our experience, two or three additional good/bad example pairs should suffice, and further instructions yield diminishing returns.

The following figure shows the final instructions for the flower example.

Run a small public labeling job to test the instructions

After you complete the first draft of the instructions, you can create and submit a small public labeling job. Inspect the results, and look for common mistakes that aren’t addressed in the current version of the instructions. Workers often make mistakes that are different from the ones that you anticipate. It’s better to catch these early in the process than to run a large and expensive labeling job twice. You can continue to repeat this process until the results are satisfactory.
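For bounding box tasks, one possible way to speed up this inspection is to compare the workers’ boxes with the ones you drew in your private job, using intersection over union (IoU). This is a sketch rather than part of Ground Truth. It assumes each box is a dictionary with `left`, `top`, `width`, and `height` in pixels, and the 0.5 threshold is an arbitrary starting point.

```python
def iou(a, b):
    """Intersection over union of two boxes given as left/top/width/height."""
    x1 = max(a["left"], b["left"])
    y1 = max(a["top"], b["top"])
    x2 = min(a["left"] + a["width"], b["left"] + b["width"])
    y2 = min(a["top"] + a["height"], b["top"] + b["height"])
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    union = a["width"] * a["height"] + b["width"] * b["height"] - intersection
    return intersection / union if union > 0 else 0.0


def unmatched_boxes(reference, candidate, threshold=0.5):
    """Reference boxes that no candidate box overlaps with IoU >= threshold."""
    return [r for r in reference if all(iou(r, c) < threshold for c in candidate)]


# Illustrative numbers only: two boxes from your private job, one from a worker.
my_boxes = [
    {"left": 10, "top": 10, "width": 100, "height": 100},
    {"left": 200, "top": 50, "width": 40, "height": 40},
]
worker_boxes = [{"left": 15, "top": 12, "width": 100, "height": 100}]

missed_by_worker = unmatched_boxes(my_boxes, worker_boxes)
extra_from_worker = unmatched_boxes(worker_boxes, my_boxes)
```

Here `missed_by_worker` contains the small second flower, which the worker did not label. Images with many missed or extra boxes are good candidates for new good/bad examples in the instructions.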

Consider simplifying your task and set a reasonable price

If your instructions are still too long, too complex, or are missing difficult examples from your data, think about how to split your task into several simpler ones. You might have noticed this image in our selection of examples:

Asking workers to label images like this for the same price as the other examples is a recipe for failure. In this case, you might first perform an image classification job to estimate the number of flowers in each image. Then, you can go back and subdivide the images with many flowers so no single image is too challenging.

As another example, consider a job that asks workers to label flowers, people, and dogs in each image. In this case you might get better results by launching three jobs, each focused on a single category. You can run these jobs in parallel or one after another and then combine the results.
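Combining the results of several single-category jobs can be as simple as grouping the annotations by image. The following sketch assumes you have already loaded each job’s results into a dictionary that maps an image URI to a list of boxes. That structure, the URIs, and the box values are illustrative and are not the Ground Truth output format itself.

```python
from collections import defaultdict


def combine_jobs(results_by_category):
    """Merge {category: {image_uri: [boxes]}} into {image_uri: [labeled boxes]}."""
    combined = defaultdict(list)
    for category, per_image in results_by_category.items():
        for image_uri, boxes in per_image.items():
            for box in boxes:
                # Copy the box and record which job (category) it came from.
                combined[image_uri].append({**box, "label": category})
    return dict(combined)


# Hypothetical results from three separate jobs.
results = {
    "flower": {"s3://my-example-bucket/img_001.jpg": [{"left": 5, "top": 5, "width": 20, "height": 20}]},
    "person": {"s3://my-example-bucket/img_001.jpg": [{"left": 50, "top": 10, "width": 30, "height": 80}]},
    "dog": {"s3://my-example-bucket/img_002.jpg": [{"left": 0, "top": 40, "width": 60, "height": 40}]},
}

merged = combine_jobs(results)
```

After merging, each image carries the boxes from every job, tagged with the category of the job that produced them.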

As the final step in the process of creating the instructions, use your newly gained experience labeling the examples yourself to set a reasonable price for your tasks. The job creation section of the Amazon SageMaker console allows you to choose a payment for each labeled example using a drop-down menu:

You can use your records of the amount of time it took to complete the labeling jobs for the instructions together with the suggestions in the menu to select an appropriate reward.
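The arithmetic behind this choice can be sketched as follows. The timings, the target hourly rate, and the list of menu prices are all made-up values for illustration, so substitute your own measurements and the options the console actually shows you.

```python
import statistics


def price_per_task(seconds_per_task, target_hourly_rate):
    """Price that pays target_hourly_rate to someone working at your median speed."""
    return statistics.median(seconds_per_task) / 3600 * target_hourly_rate


def closest_menu_price(price, menu_prices):
    """Smallest menu option that is at least the computed price."""
    candidates = [p for p in sorted(menu_prices) if p >= price]
    return candidates[0] if candidates else max(menu_prices)


# Seconds you needed per image in the private job (illustrative).
my_timings = [20, 25, 31, 28, 40]
# Hypothetical menu options in USD per labeled example.
hypothetical_menu = [0.012, 0.024, 0.036, 0.048, 0.060, 0.072, 0.096, 0.120]

estimate = price_per_task(my_timings, target_hourly_rate=12.0)
choice = closest_menu_price(estimate, hypothetical_menu)
```

With these numbers the median time is 28 seconds, the estimate is about $0.093 per image, and the selected menu option is $0.096. Workers who are seeing the task for the first time may well be slower than you, so it can make sense to round up.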

Conclusion

Instructions specific to your data will always be superior to generic ones. Creating them might be time-consuming, but the workers will appreciate your effort. They want to complete your task as quickly as possible, and making their lives easier will improve your results.

Here are some resources if you would like to learn more about Ground Truth and making instructions for a public workforce:

Disclosure regarding the Open Images Dataset V4

Open Images Dataset V4 is created by Google Inc. In some cases we have modified the images or the accompanying annotations. You can obtain the original images and annotations here. The annotations are licensed by Google Inc. under CC BY 4.0 license. The images are listed as having a CC BY 2.0 license. The following paper describes Open Images V4 in depth: from the data collection and annotation to detailed statistics about the data and evaluation of models trained on it.

A. Kuznetsova, H. Rom, N. Alldrin, J. Uijlings, I. Krasin, J. Pont-Tuset, S. Kamali, S. Popov, M. Malloci, T. Duerig, and V. Ferrari. The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale. arXiv:1811.00982, 2018.


About the Authors

Tristan McKinney is an applied scientist in the Amazon ML Solutions Lab. He recently completed his PhD in theoretical physics at Caltech where he studied effective field theory and its application to high-T_c superconductors. As his father was in the US Army, he lived all over the place when growing up, including Germany and Albania. In his spare time, Tristan loves to ski and play soccer.

 

 

Krzysztof Chalupka is an applied scientist in the Amazon ML Solutions Lab. He has a PhD in causal inference and computer vision from Caltech. At Amazon, he figures out ways in which computer vision and deep learning can augment human intelligence. His free time is filled with family. He also loves forests, woodworking, and books (trees in all forms).

 

 

 Fedor Zhdanov is a Machine Learning Scientist at Amazon. He works on developing Machine Learning algorithms and tools for our internal and external customers.