An analog-AI chip for energy-efficient speech recognition and transcription

Can Zoom use your meetings to train AI?

The first is the use of a dense and efficient circuit-switched 2D mesh to exchange massively parallel vectors of neuron-activation data over short distances. The second is the successful implementation of DNN models that are large enough to be relevant for commercial use and are demonstrated at sufficiently high accuracy levels. With the help of AI, a facial recognition system maps facial features from an image and then compares this information with a database to find a match.

Image recognition helps self-driving and autonomous cars perform at their best. Images from rear-facing cameras, other sensors, and LiDAR are compared against a dataset by the image recognition software, which helps the vehicle accurately detect other vehicles, traffic lights, lanes, pedestrians, and more. For face recognition, the AI is trained to map a person’s facial features and compare them with images in a database to find a match.
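The matching step described above can be sketched as a nearest-neighbor search over feature vectors. This is only an illustration under stated assumptions: the feature vectors (embeddings) are taken as already computed, cosine similarity is one common choice of comparison rather than the method any particular system uses, and the names `best_match`, `database`, and the `0.8` threshold are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def best_match(query, database, threshold=0.8):
    """Return the name of the most similar stored face, or None.

    `database` maps a name to a stored feature vector. The threshold is an
    assumed value; a real system would tune it on validation data.
    """
    best_name, best_score = None, threshold
    for name, embedding in database.items():
        score = cosine_similarity(query, embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Hypothetical three-dimensional embeddings, far smaller than real ones.
database = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
match = best_match([0.9, 0.1, 0.0], database)
no_match = best_match([0.0, 0.0, 1.0], database)
```

Returning `None` when nothing clears the threshold keeps an unknown face from being forced onto the closest stored identity.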

Inside view: Safer than humans

AI-based image recognition is an essential computer vision technology that can be either the building block of a bigger project (e.g., when paired with object tracking or instance segmentation) or a stand-alone task. As the popularity of image recognition and its range of use cases grow, we would like to tell you more about this technology, how AI image recognition works, and how it can be used in business. The terms image recognition and computer vision are often used interchangeably, but they are different. Image recognition is an application of computer vision that often requires more than one computer vision task, such as object detection, image identification, and image classification.

We also clipped the MFCCs to the range (−30, 30) to avoid any potential activation-rescaling problems going into our HW. This preprocessing resulted in a two-dimensional MFCC fingerprint for each keyword with dimensions of 49 × 40 (Extended Data Fig. 3c), which is then flattened to give a 1,960-input vector. We also randomly shifted keywords by 100 ms and introduced background noise into 80% of the training samples to make keyword detection more realistic and resilient.
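The clipping and flattening steps can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: it assumes the fingerprint arrives as 49 rows of 40 coefficients, that values are clamped to the stated limits, and that flattening is row by row (the text does not state the order). The function name `clip_and_flatten` is hypothetical.

```python
CLIP_MIN, CLIP_MAX = -30.0, 30.0   # clipping range given in the text
N_FRAMES, N_COEFFS = 49, 40        # fingerprint dimensions given in the text

def clip_and_flatten(mfcc):
    """Clamp a 49 x 40 MFCC fingerprint to the clip range and flatten it.

    Row-by-row flattening is an assumption; any fixed order works as long
    as training and inference use the same one.
    """
    if len(mfcc) != N_FRAMES or any(len(row) != N_COEFFS for row in mfcc):
        raise ValueError("expected a 49 x 40 MFCC fingerprint")
    return [min(max(v, CLIP_MIN), CLIP_MAX) for row in mfcc for v in row]

# Synthetic fingerprint whose values all fall outside the clip range.
fingerprint = [[100.0 if (i + j) % 2 else -100.0 for j in range(N_COEFFS)]
               for i in range(N_FRAMES)]
vector = clip_and_flatten(fingerprint)
```

The resulting vector has 49 × 40 = 1,960 entries, matching the input size quoted above.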

Vision AI

Datasets have to consist of hundreds to thousands of examples and be labeled correctly. In case there is enough historical data for a project, this data will be labeled naturally. Also, to make an AI image recognition project a success, the data should have predictive power.

To compare the weight programming in the five chips used for the RNNT experiment, we calculated the CDF on the basis of the data shown in Extended Data Fig. In this way, two data points were extracted for each tile, one for WP1 and one for WP2. The chip analog yield, measured as the fraction of weights with a programming error of less than 20% of the maximum weight magnitude, is around 99% (Extended Data Fig. 4e). Chip 4 has a slightly lower yield because the corresponding maximum W, defined as the coefficient used to rescale weights from MLPerf (around [−1, 1]) to integers, is larger because more signal was required, causing greater weight saturation.
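The yield definition above can be expressed as a short calculation. This is a sketch under assumptions: the "maximum weight magnitude" is taken to be the largest absolute target weight, the error is the absolute difference between programmed and target values, and the function name `analog_yield` and the sample numbers are hypothetical rather than measured data.

```python
def analog_yield(target, programmed, threshold=0.2):
    """Fraction of weights programmed to within threshold * max |target|.

    `target` and `programmed` are equal-length lists of weight values.
    Using the largest absolute target weight as the reference magnitude
    is an assumption about the definition in the text.
    """
    if not target or len(target) != len(programmed):
        raise ValueError("need two non-empty lists of equal length")
    limit = threshold * max(abs(w) for w in target)
    within = sum(1 for t, p in zip(target, programmed) if abs(p - t) < limit)
    return within / len(target)

# Made-up example: errors are 0.1, 0.0, 0.25, and 0.3 against a limit of 0.2.
target_weights = [1.0, -0.5, 0.25, 0.0]
programmed_weights = [0.9, -0.5, 0.5, 0.3]
yield_fraction = analog_yield(target_weights, programmed_weights)
```

Here two of the four weights fall within the limit, so the sketch reports a yield of 0.5; the chips described above reach around 99% by the same kind of measure.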
