1. Introduction
Before understanding what face search is, what the use cases are, and why performing face search fast is so crucial, let us understand the following two key terms used in this domain:
- Face Verification: A one-to-one comparison of faces that confirms an individual’s identity by comparing his/her face against a face or face template stored on an identity card, or against an image of the card captured directly by a camera. An example is an organisation authenticating a user by comparing the image stored in the offline eKYC XML of Aadhaar with the face captured through a camera. This face capture can happen through cameras mounted at the entry point or through a web application using a computer’s camera. Other use cases include online banking and passport checks.
- Face Recognition: The purpose of face recognition is to identify/recognise the person from a database of faces by performing a one-to-many comparison.
Face images are not compared directly; instead, deep learning-based models transform each face into an embedding. An embedding is simply a vector: a mathematical representation of the face in the embedding space learnt by the model. By computing a distance or similarity metric, such as cosine similarity, between two embeddings and comparing it with a threshold, we can tell whether the two faces belong to the same person. Other common metrics include dot product, squared Euclidean, Manhattan, and Hamming distance.
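To make the comparison concrete, here is a minimal NumPy sketch of both metrics. The 128-dimensional vectors below are random stand-ins for real embeddings (dlib-style face encodings are 128-dimensional, and a Euclidean distance below roughly 0.6 is conventionally treated as "same person"); the helper names are mine:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors (1.0 = identical direction)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))

# Hypothetical 128-d embeddings; real ones would come from a face model
rng = np.random.default_rng(7)
face_a = rng.random(128)
face_b = face_a + rng.normal(0.0, 0.01, 128)  # slightly perturbed copy: "same person"
face_c = rng.random(128)                       # unrelated vector: "different person"

same_dist = euclidean_distance(face_a, face_b)
diff_dist = euclidean_distance(face_a, face_c)
print(same_dist, diff_dist, cosine_similarity(face_a, face_b))
```

The perturbed copy lands well inside the 0.6 threshold while the unrelated vector falls far outside it, which is the whole basis of the same-person decision.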
There are many use cases where the database holds millions, even billions, of images for comparison. Naive one-to-many comparisons against such a huge number of images are unworkable in real-time use cases.
In this article and the accompanying code, I have used Facebook AI Similarity Search (Faiss), a library that helps in quickly searching across multimedia documents that are similar to each other. The first step is data ingestion, where multimedia documents (face images in this case) are transformed into vector embeddings and then saved in the database. When queried, this database returns the k nearest neighbours of the queried face, that is, the k faces most similar to the query image. Other competing vector databases provide similar functionality. Read more about Faiss in the article “Faiss: A library for efficient similarity search”.
2. Data Ingestion
I used the Labelled Faces in the Wild (LFW) dataset, which has over 13,000 images of faces collected from the web. The face images of each person are stored in a directory named after that person, and all these directories are located in a directory named lfw-deepfunneled. The following is the code snippet to
- Load the face images from the directory.
- Transform the loaded face images to face embeddings.
To perform both operations, I used the face-recognition library. This Python library is built using dlib’s state-of-the-art face recognition. The loading step additionally detects the face region in the original image, crops it, and returns it. The transformation step transforms the cropped face into a vector embedding. Following is the code snippet. representations is a list of [key, value] pairs, where the key is the file name and the value is the corresponding vector embedding; embeddings is a list that stores all the vector embeddings.
import os
import face_recognition

representations = []
path_dataset = "lfw-deepfunneled"
dirs = os.listdir(path_dataset)
dirs.sort()
count = 1
for dir in dirs:
    file_names = os.listdir(os.path.join(path_dataset, dir))
    for file_name in file_names:
        full_path_of_image = os.path.join(path_dataset, dir, file_name)
        print(f"Count: {count}, Image path: {full_path_of_image}")
        loaded_image = face_recognition.load_image_file(full_path_of_image)
        face_encodings = face_recognition.face_encodings(loaded_image)
        if len(face_encodings) > 0:
            # Keep only the first detected face in the image
            representations.append([file_name, face_encodings[0]])
        count = count + 1

embeddings = []
for key, value in representations:
    embeddings.append(value)
print("Size of total embeddings: " + str(len(embeddings)))
The next step is to initialise the Faiss index, store the vector embeddings in it, and serialise the index to disk. Finally, serialise the representations list to disk. The intent is that when the face search module starts, it loads the serialised index and list into memory. Following is the code snippet:
import pickle
import faiss
import numpy as np

# Initialise the vector store and save the embeddings
print("Storing embeddings in faiss.")
index = faiss.IndexFlatL2(128)  # dlib face encodings are 128-dimensional
index.add(np.array(embeddings, dtype="f"))
# Save the index
faiss.write_index(index, "face_index.bin")
# Save the representations
with open('face_representations.txt', 'wb') as fp:
    pickle.dump(representations, fp)
print("Done")
3. Face Search
The following are the steps for face search:
- Load the serialised Faiss index and the representations list.
- Create a search interface (a web interface using Streamlit in this case).
- Upload the query face image, crop the face, and transform it into a vector embedding.
- Pass the query vector embedding to the Faiss database.
- The Faiss database returns the k nearest neighbours.
- Perform 1-to-k comparisons of the query embedding against the k face embeddings returned from the database.
- Compare each distance with a threshold to decide whether the person is found. If found, show the matching face images.
Following is the code snippet:
import pickle
import time
import uuid

import faiss
import face_recognition
import numpy as np
import streamlit as st

# Load the face embeddings from the saved face_representations.txt file
def get_data():
    with st.spinner("Wait for the dataset to load...", show_time=True):
        with open('face_representations.txt', 'rb') as fp:
            representations = pickle.load(fp)
        # Load the index
        face_index = faiss.read_index("face_index.bin")
        return representations, face_index

# Load the face embeddings at startup and store them in the session
if st.button('Rerun'):
    st.session_state.representations, st.session_state.index = get_data()
if 'index' not in st.session_state:
    st.session_state.representations, st.session_state.index = get_data()

index = st.session_state.index
representations = st.session_state.representations

# Search web interface
with st.form("search-form"):
    uploaded_face_image = st.file_uploader("Choose face image for search", key="search_face_image_uploader")
    if uploaded_face_image is not None:
        tic = time.time()
        st.text("Saving the query image...")
        print("Saving the query image in the directory: query-images")
        random_query_image_name = uuid.uuid4().hex
        query_image_full_path = "query-images/" + random_query_image_name + ".jpg"
        with open(query_image_full_path, "wb") as binary_file:
            binary_file.write(uploaded_face_image.getvalue())
        st.image(uploaded_face_image, caption="Image uploaded for search")
        query_image = face_recognition.load_image_file(query_image_full_path)
        query_image_embedding = face_recognition.face_encodings(query_image)
        if len(query_image_embedding) > 0:
            query_image_embedding = np.expand_dims(query_image_embedding[0], axis=0)
            # Search
            st.text("Searching the images...")
            k = 1
            distances, neighbours = index.search(query_image_embedding, k)
            is_image_found = False
            for i, distance in enumerate(distances[0]):
                # IndexFlatL2 returns squared L2 distances; smaller means more similar
                if distance < 0.3:
                    st.text("Found the image.")
                    st.text("Distance: " + str(distance))
                    image_file_name = representations[neighbours[0][i]][0]
                    # File names look like "Name_0001.jpg"; stripping the last
                    # 9 characters ("_0001.jpg") recovers the directory name
                    image_path = "lfw-deepfunneled/" + image_file_name[:-9] + "/" + image_file_name
                    st.image(image_path)
                    is_image_found = True
            if not is_image_found:
                st.text("Could not find the image.")
        toc = time.time()
        st.text("Total time taken: " + str(toc - tic) + " seconds")
    st.form_submit_button('Submit')
Other Details
The complete code is available on GitHub.
Dependent Libraries:
pip install face-recognition
pip install faiss-cpu
pip install streamlit
(pickle is part of the Python standard library and needs no separate installation.)
Steps to Run the Application
pip install -r /path/to/requirements.txt
python data_ingestion_2_vector_db.py
streamlit run WebApp.py
Screenshot of the application:
