Background
I have a well-trained ssd320x320 TensorFlow model from the TensorFlow model zoo. The reports are pretty good: the training log shows a low loss, and the eval log indicates that 7 out of 9 test images were detected successfully. The model was trained on a GPU and saved as ckpt3.
The goal is to detect when a person "likes" with their hand.
Problem
Loading a model from its last checkpoint works well, and I achieved detection with the following function:
def test1(self):
    # Works great
    for img_path in glob.glob(r"test_dir\*.jpg"):
        plt.figure()
        plt.imshow(self.get_image_np_with_detections(self._load_image_into_numpy_array(img_path)))
        plt.show()
# Note that get_image_np_with_detections() is the detection @tf.function(),
# exactly as written in the TensorFlow documentation, with no changes.
# _load_image_into_numpy_array() simply returns np.array(Image.open(path))
Object detection in images was successfully achieved in test1.
The problem is that I fail to detect an object in webcam frames.
From another function, which opens my webcam, I call the same detection function for each frame. This fails: not a single green detection box appears on screen.
def open_webcam(self):
    # Doesn't show detection green boxes at all
    cap = cv2.VideoCapture(0)
    while cap.isOpened():
        ret, image_np = cap.read()
        im_detected = self.get_image_np_with_detections(image_np)
        cv2.imshow('object detection', cv2.resize(im_detected, (800, 600)))
    # release, destroy...
Where is the problem
While debugging, I saved screenshots from my webcam while running the open_webcam() function (one screenshot every 1-2 seconds). The screenshots were saved into test_dir and then processed with test1. That test was successful: all screenshots were marked with a green detection box (hand-like-sign).
This indicates the problem lies in how I pass frames to the function, since every frame was detected with the test1 approach, but not in real time. To summarize:
- I failed to detect a like-sign in a webcam frame (real-time).
- I saved the frames inside test_dir, each with a unique id.
- I managed to detect a like-sign after opening the jpg in test1() (9/10 screenshots).
I have tried to…
- pass frames as a numpy array, with no luck.
- expand the dimensions as mentioned in the TF documentation, again with no luck.
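For reference, the dimension expansion from the TF documentation just adds a leading batch axis: TF Object Detection API models take a [1, H, W, 3] tensor rather than a bare [H, W, 3] image. A minimal sketch (the 320x320 size is chosen to match the ssd320x320 model; any H and W work the same way):

```python
import numpy as np

# A single HxWx3 frame, as returned by cap.read() or np.array(Image.open(...))
image_np = np.zeros((320, 320, 3), dtype=np.uint8)

# Add the leading batch dimension the detection function expects: [1, H, W, 3]
input_batch = np.expand_dims(image_np, axis=0)
print(input_batch.shape)  # (1, 320, 320, 3)
```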
Note that…
- I have only 1 label, which is Like (hand-sign).
- I used around 25 train images and 9 test images.
- As mentioned, the model works great when opening saved jpg files. The eval report looks good.
- PY is 3.7, TF is 2.7, CV is 4.5.5.
Thanks in advance!
>Solution :
The TensorFlow model is most likely trained on RGB images, while cv2 delivers frames in BGR channel order. Try
    image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
Also, the model may have been trained on normalized images, so if converting BGR to RGB doesn't help, try
    image_np = image_np / 255.
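To see what the channel-order fix actually does, here is a minimal sketch on a synthetic frame (no webcam needed). The identifiers are illustrative; reversing the last axis with `[..., ::-1]` is the NumPy equivalent of `cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)` for a 3-channel image:

```python
import numpy as np

# Synthetic "webcam frame": pure blue, in OpenCV's BGR channel order.
bgr_frame = np.zeros((4, 4, 3), dtype=np.uint8)
bgr_frame[:, :, 0] = 255  # channel 0 holds Blue in BGR

# Reversing the last axis is equivalent to
# cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB) for a 3-channel image.
rgb_frame = bgr_frame[..., ::-1]
print(rgb_frame[0, 0].tolist())  # [0, 0, 255] -- blue moved to the B slot of RGB

# If the model was trained on normalized input, also scale to [0, 1].
normalized = rgb_frame.astype(np.float32) / 255.0
print(normalized.max())  # 1.0
```

In the asker's loop, the equivalent one-liner is `image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)` placed right after `cap.read()`, before calling `get_image_np_with_detections`. A model trained on RGB jpgs (which is why test1 works on saved files opened with PIL) will otherwise see swapped red/blue channels and miss detections.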