WeCat AI - Cat Database for AI Voice Recognition
Create a comprehensive cat voice library to enable an AI-based system that improves human-pet communication and enhances human mental health.
Using Audacity to label cat audio
Team plays with Sally during meeting break
Hermione's Happiest Time
Team works together to produce video for this competition
Address: California: Saratoga (95070), 13252 Glasgow Court
Date You Started Your Project
Project Stage: Select the description below that best applies to your approach.
Growth (have moved past the very first activities; working towards the next level of expansion)
1. The Problem: What problem are you helping to solve?
Our mission is to break down the communication barrier between humans and cats. Research has shown that pets can play an essential part in many people’s lives by helping them relieve stress and providing happiness. The challenge is that humans cannot understand cats well enough, which negatively affects the cats’ own mental health. An effective communication tool between humans and cats would improve the happiness and health of both.
2. Your Solution: How are you planning to solve this problem? Share your specific approach.
The project has two parts: a high-quality database and an accurate AI model. There is no cat voice database with accurate labeling, which has been a bottleneck in this research area. Building one requires a large amount of manually classified data. We spent months collecting audio from across the Internet and labeling it rigorously. At the time of this submission we have collected and labeled over 8000 audio clips. Producing high-quality labels is both very important and difficult. We will continue to expand the database by following these steps: 1) understand cat voice theory; 2) use video information for voice labeling; 3) cross-check labels among team members; and 4) choose test samples and run the AI model for feedback. For the AI model, we plan to use a Convolutional Neural Network (CNN) to process the actions and speech of a cat and produce a translation that humans can understand. To feed our labeled data into the neural network, we first convert it into a compatible format: a preprocessing step turns each audio clip into an image using a digital signal processing technique, the mel spectrogram, which the CNN can read and process.
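The preprocessing step described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration of how a mel spectrogram is computed, not our actual pipeline: the function names, the frame size, the hop length, and the number of mel bands are all assumptions chosen to keep the example small, and a synthetic 1 kHz tone stands in for a real cat clip.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (bins 0 .. N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(total) ** 2)
    return spectrum


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2)
    # n_mels + 2 edge points on the mel axis, converted back to Hz.
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))  # rising edge of the triangle
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))  # falling edge of the triangle
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mel_spectrogram(samples, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each holding n_mels log-mel energies."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / n_fft) for t in range(n_fft)]
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(samples) - n_fft + 1, hop):
        frame = [samples[start + t] * window[t] for t in range(n_fft)]
        power = power_spectrum(frame)
        mel_energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
        frames.append([math.log(e + 1e-10) for e in mel_energies])
    return frames


# Synthetic stand-in for a cat clip: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = mel_spectrogram(tone, sample_rate)
print(len(spec), "frames x", len(spec[0]), "mel bands")
```

Each row of `spec` is one time slice and each column is one mel band, so the result can be treated as a small grayscale image. In practice an audio library would replace the naive DFT here, which is written out only to make each step visible.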
3. Personal Journey: What’s the story behind why you decided to start this project?
Our team was started by Vincent, whose personal story inspired the idea for our project. Ever since Vincent was in elementary school, his mother had suffered from various health issues, including depression. After the family adopted two cats, Sebastian and Sally, her mood lifted tremendously through touching and interacting with the cats, whom she came to consider her own “son and daughter.” She spent a lot of time with them, and after a while she was able to communicate with them and understand what they needed. However, not everyone can communicate with cats as effectively as she can; she had raised cats when she was a teenager. For most people, the biggest challenges are understanding what a cat is saying and building a strong bond with it. This pushed Vincent to create a project that would help many other people communicate effectively with cats and make the experience fun for both humans and cats.
4. Selfie Elevator Pitch: Include a 1-minute video that answers the following: “I am stepping up to make change because...”
5. Example: Please walk us through a specific example of what happens when a person or group gets involved with your project.
A boy suffering from depression sits in his room on a winter day. He feels lonely and useless, and shuts himself inside even though it is sunny and warm outside after days of snow. His cat comes to him and makes a sound. His mobile phone picks up the sound, and our WeCat AI app translates it into words. Seeing the message “Would you please take me outside?” on the screen, the boy opens the door to let the cat out. He watches the cat walk around happily after days of staying inside the house, and decides to sit on a couch in the garden for a while. The cat jumps onto the boy’s knee and opens its mouth: “Meow.” A sentence appears on the boy’s phone screen: “Thank you! I love you!” The boy smiles, and the two enjoy the warm sunshine together on a beautiful winter day.
6. The X Factor: What is different about your project compared to other programs or solutions already out there?
While research on cat voice recognition has been done in the past, our research goes much further. For the database, we have devoted significant time to learning about cat voice and behavior, and worked hard to get accurate labels for the clips we collected on the Internet. After comparing our database with others, such as Google’s, we believe it is one of the best. Our recognition approach uses two different neural networks: one classifies a single sound, and the other formulates a sentence from a collection of sounds. To our knowledge, our prediction accuracy far exceeds that of any previous research.
7. Impact: How has your project made a difference so far?
We have aimed to combine technology and social impact from the start, and we have engaged technical experts, cat owners and pet shelters. On the technical side, we’ve met with a few AI experts from various colleges, such as UC Berkeley and Princeton. We also contacted Professor Susanne Schötz, who is a renowned expert in cat voice. These experts have taught us about the newest technology and suggested ideas that could help make our project a success. We have created a training set of labeled cat videos with nine categories of cat sounds: hiss, growl, meow, purr, chatter, scream, trill, howl, and call. Our first results achieve 70% accuracy, and we have high hopes of reaching 80% soon. Meanwhile, the feedback from cat owners has been overwhelmingly positive. They told us that the “translation” tool would help them a great deal in taking care of their cats and would bring them more happiness.
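As a rough illustration of how an accuracy figure like the one above can be computed, the sketch below compares predicted labels against hand-assigned labels and reports both overall and per-category accuracy. The function name `accuracy_report` and the ten sample labels are made up for this example; they are not our data or our results.

```python
from collections import defaultdict


def accuracy_report(true_labels, predicted_labels):
    """Return (overall accuracy, {category: accuracy}) for two label lists."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    overall = sum(correct.values()) / len(true_labels)
    per_category = {label: correct[label] / total[label] for label in total}
    return overall, per_category


# Illustrative labels only: hand-assigned truth vs. model guesses.
truth = ["meow", "meow", "purr", "hiss", "purr", "meow", "hiss", "purr", "meow", "hiss"]
guess = ["meow", "purr", "purr", "hiss", "purr", "meow", "growl", "meow", "meow", "hiss"]

overall, per_category = accuracy_report(truth, guess)
print(f"overall accuracy: {overall:.0%}")
for label, score in sorted(per_category.items()):
    print(f"  {label}: {score:.0%}")
```

Looking at the per-category numbers alongside the overall figure can help show which sounds the model confuses most, which in turn suggests where more labeled clips are needed.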
8. What’s Next: What are your ideas for taking your project to the next level?
In addition to our continued effort to increase the volume and quality of our database, a major effort is going to be finding a way to make our product more accessible to the public. We want to create an app for our recognition model and have recently started working on it. However, once our project is further polished, we may want to think about a different way to make it more practical in everyday life, since cats may not repeat the same sound multiple times for us to record. One potential solution is a collar that constantly listens to the cat and connects to the app, which translates the sound for the owner. Additionally, we are looking for service opportunities, such as animal shelters, where we can test our product and help the community.
9. Which of the following types of expertise would be most useful for you?
10. Finances: If applicable, have you mobilized any of the following resources so far?
Donations between $100-$1k
Help Us Support Diversity! Part 1 [optional] Which of the following categories do you identify with?
White (for example: German, Irish, English, Italian, Polish, French) (6)
Asian (for example: Chinese, Filipino, Indian, Vietnamese, Korean, Japanese, Pakistani) (9)
Help Us Support Diversity! Part 2 [optional] Do you identify as part of any of the following underrepresented communities?
No, I do not identify with an underrepresented community
How did you hear about this challenge?