
Some simple applications
To conclude the code provisioning, we demonstrate three simple scripts, one for each of the sources our project can draw on: image files, videos, and the webcam.
Our first test script annotates and visualizes three images. It imports the class DetectionObj from the local directory (if you are operating from another directory, the import won't work unless you first add the project directory to the Python path):
import sys
# make the project directory importable; replace the placeholder with your path
sys.path.insert(0, '/path/to/directory')
Then we instantiate the class, declaring the SSD MobileNet v1 model to be used. After that, we put the path of every image into a list and feed it to the file_pipeline method:
from TensorFlow_detection import DetectionObj
if __name__ == "__main__":
    detection = DetectionObj(model='ssd_mobilenet_v1_coco_11_06_2017')
    images = ["./sample_images/intersection.jpg",
              "./sample_images/busy_street.jpg",
              "./sample_images/doge.jpg"]
    detection.file_pipeline(images)
When our detection class processes the intersection image, it returns another image, enriched with bounding boxes around the objects recognized with sufficient confidence:
[Figure: the intersection image annotated with bounding boxes around the detected cars and pedestrians]
After running the script, all three images will be displayed on screen with their annotations (each for three seconds), and a new JSON file will be written to disk (in the target directory, which defaults to the local directory unless you state otherwise by modifying the class variable TARGET_CLASS).
In the visualization, you will see bounding boxes for all objects whose prediction confidence is above 0.5. You will notice, however, that in the annotated intersection image (depicted in the preceding figure) not all the cars and pedestrians have been spotted by the model.
By looking at the JSON file, you will discover that many other cars and pedestrians have been located by the model, though with lower confidence. The file holds all the objects detected with at least 0.25 confidence, a threshold that is a common standard in studies on object detection (but you can change it by modifying the class variable THRESHOLD).
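As a minimal sketch, assuming THRESHOLD is an ordinary class attribute of DetectionObj (as the description above suggests), you could override it before running a pipeline to record detections down to, say, 0.1 confidence:
from TensorFlow_detection import DetectionObj
if __name__ == "__main__":
    detection = DetectionObj(model='ssd_mobilenet_v1_coco_11_06_2017')
    # assumption: THRESHOLD is a plain class attribute, so an instance-level
    # override changes the recording cut-off for this detector only
    detection.THRESHOLD = 0.1
    detection.file_pipeline(["./sample_images/intersection.jpg"])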
Here you can see the scores generated in the JSON file. Only eight detected objects are above the visualization threshold of 0.5, whereas the 16 other objects have lower scores:
"scores": [0.9099398255348206, 0.8124723434448242, 0.7853631973266602, 0.709653913974762, 0.5999227166175842, 0.5942907929420471, 0.5858771800994873, 0.5656214952468872, 0.49047672748565674, 0.4781857430934906, 0.4467884600162506, 0.4043623208999634, 0.40048354864120483, 0.38961756229400635, 0.35605812072753906, 0.3488095998764038, 0.3194449841976166, 0.3000411093235016, 0.294520765542984, 0.2912806570529938, 0.2889115810394287, 0.2781482934951782, 0.2767323851585388, 0.2747304439544678]
And here are the corresponding classes of the detected objects. Many cars have been spotted with lower confidence; they may be actual cars in the image or errors. Depending on your application of the Detection API, you may want to adjust the threshold, or use another model and accept an object only if it has been repeatedly detected by different models above a threshold (a sketch of this kind of post-processing follows the class list below):
"classes": ["car", "person", "person", "person", "person", "car", "car", "person", "person", "person", "person", "person", "person", "person", "car", "car", "person", "person", "car", "car", "person", "car", "car", "car"]
Applying detection to videos uses the same scripting approach. This time you point the appropriate method, video_pipeline, at the path to the video and set whether the resulting video should keep its audio (by default, the audio will be filtered out). The script does everything by itself, saving a modified and annotated video in the same directory as the original (you can spot it because it has the same filename prefixed with annotated_):
from TensorFlow_detection import DetectionObj
if __name__ == "__main__":
    detection = DetectionObj(model='ssd_mobilenet_v1_coco_11_06_2017')
    detection.video_pipeline(video="./sample_videos/ducks.mp4", audio=False)
Finally, you can also leverage the exact same approach for images acquired by a webcam. This time you will be using the method webcam_pipeline:
from TensorFlow_detection import DetectionObj
if __name__ == "__main__":
    detection = DetectionObj(model='ssd_mobilenet_v1_coco_11_06_2017')
    detection.webcam_pipeline()
The script will activate the webcam, let it adjust to the light, take a snapshot, save the snapshot and its annotation JSON file in the current directory, and finally display the snapshot on your screen with bounding boxes around the detected objects.
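For the curious, the light-adjustment step can be reproduced with OpenCV alone. The following is a minimal sketch of how such a capture might work (not the actual webcam_pipeline internals): it discards a handful of initial frames so that the camera's auto-exposure can settle before keeping one:
import cv2

def capture_snapshot(device=0, warmup_frames=30):
    # grab one frame from the webcam after letting auto-exposure settle
    capture = cv2.VideoCapture(device)
    if not capture.isOpened():
        raise RuntimeError("Could not open the webcam")
    # throw away the first frames while the camera adjusts to the light
    for _ in range(warmup_frames):
        capture.read()
    success, frame = capture.read()
    capture.release()
    if not success:
        raise RuntimeError("Could not read a frame from the webcam")
    return frame

if __name__ == "__main__":
    cv2.imwrite("webcam_snapshot.jpg", capture_snapshot())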