
to achieve the goals of the AI. The first area it observes is user behavior via metadata. It draws conclusions about a video from the behavior of the person whose eyes are on the screen and whose fingers are doing the clicking. “Satisfaction signals” teach the AI what to suggest and what to leave out. The list of these signals is very specific:

       Which videos a user watches

       Which videos they skip

       Time they spend watching

       Likes and dislikes

       “Not interested” feedback

       Surveys after watching a video

       Whether they come back to rewatch or finish something unwatched

       If they save and come back to watch later

      All of these signals feed the Satisfaction Feedback Loop. The loop is built from the feedback the algorithm gets from your specific behavior: it “loops” the types of videos you like back into its suggestions. This is how it personalizes each user's experience.
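To make the loop concrete, here is a minimal sketch of how a handful of these signals could be folded into per-topic scores for one viewer. It is not YouTube's implementation: the signal names, the weights, and the `update_topic_scores` and `rank_topics` helpers are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical weights. The text lists the signals, but how the real
# system weighs them is not public; these numbers are invented.
SIGNAL_WEIGHTS = {
    "watched": 1.0,
    "skipped": -1.0,
    "liked": 2.0,
    "disliked": -2.0,
    "not_interested": -3.0,
    "rewatched": 1.5,
    "saved_for_later": 1.0,
}


def update_topic_scores(scores, events):
    """Fold a batch of (topic, signal) events into per-topic scores."""
    for topic, signal in events:
        scores[topic] += SIGNAL_WEIGHTS[signal]
    return scores


def rank_topics(scores):
    """Return topics ordered from highest to lowest score."""
    return sorted(scores, key=lambda topic: scores[topic], reverse=True)


# One viewer's recent behavior (invented sample data).
events = [
    ("physics", "watched"),
    ("physics", "liked"),
    ("cooking", "skipped"),
    ("cooking", "not_interested"),
    ("music", "watched"),
]

scores = update_topic_scores(defaultdict(float), events)
ranking = rank_topics(scores)
print(ranking)  # → ['physics', 'music', 'cooking']
```

Each new batch of behavior changes the scores, which changes the ranking, which changes what is suggested next. That cycle is the "loop" in this toy version.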

      Gathering Metadata

      To get down to the details, here's an explanation of exactly how the AI gathers data. Observing metadata starts with the thumbnail. The YouTube AI uses the advanced technology of Google's suite of AI products. It operates a program called Cloud Vision (CV). CV uses optical character recognition (OCR) and image recognition to determine many things about a video from what it finds in the thumbnail. It takes points from each image in the thumbnail, matches them against the billions of data points already in the system to recognize those images, and feeds that information back into the algorithm. For example, a thumbnail with a close‐up of world‐renowned physicist Stephen Hawking's face is recognized as such by CV, so that video can be “grouped” in the suggested feed with every other video on YouTube that has been tagged under the Stephen Hawking topic. This is how your videos get discovered and watched.

[Figure: snapshot of a thumbnail annotated with data points]
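The grouping step described above can be sketched in a few lines. This example assumes the image-recognition pass has already run and produced a set of labels per thumbnail. The video ids, the labels, and the `build_topic_index` and `related_videos` helpers are hypothetical, and the sketch does not call the real Cloud Vision service.

```python
from collections import defaultdict

# Hypothetical output of an image-recognition pass over thumbnails:
# video id -> labels recognized in the image. All values are invented.
thumbnail_labels = {
    "video_a": {"Stephen Hawking", "wheelchair"},
    "video_b": {"Stephen Hawking", "black hole"},
    "video_c": {"guitar"},
}


def build_topic_index(labels_by_video):
    """Invert video -> labels into label -> set of videos."""
    index = defaultdict(set)
    for video_id, labels in labels_by_video.items():
        for label in labels:
            index[label].add(video_id)
    return index


def related_videos(video_id, labels_by_video, index):
    """Return the other videos that share at least one thumbnail label."""
    related = set()
    for label in labels_by_video[video_id]:
        related |= index[label]
    related.discard(video_id)  # a video is not its own suggestion
    return sorted(related)


index = build_topic_index(thumbnail_labels)
print(related_videos("video_a", thumbnail_labels, index))  # → ['video_b']
```

The two Stephen Hawking thumbnails end up grouped together, while the guitar video shares no label with them and stays separate.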

      Video Intelligence

      Closed Captioning

      The AI does the same thing with the language of the video. YouTube now has an auto‐caption feature, and the AI reads through the words of the captions to gather data as well. Going through the video frames using shot lists is like looking at what is being said visually, while listening to the audio provides even more feedback from what is actually being verbalized. Everything goes into the system.

      Natural Language

      The AI is also listening for actual sentence structure and breaking it down into a sentence diagram. This extracts the meaning of what is being said. It can differentiate language well enough to group videos by category, and not just on the surface. For example, two different creators might both talk about Stephen Hawking in their videos, but one video might be biographical or scientific while the other might be humorous or entertaining. Even though both videos are about the same person, they are different enough in kind that the AI would categorize them differently and group them with different recommended content because of the language being used.
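As a rough illustration of how two videos on the same subject can land in different categories, the sketch below counts category cue words in a transcript. This is a deliberately simple stand-in, not the sentence-diagram analysis described above. The cue lists, the sample transcripts, and the `categorize` helper are assumptions invented for the example.

```python
import re

# Hypothetical cue words per category. A real system would learn such
# associations from data instead of using a hand-written list.
CATEGORY_CUES = {
    "educational": {"theory", "radiation", "physicist", "research", "born"},
    "entertainment": {"funny", "joke", "prank", "hilarious", "lol"},
}


def categorize(transcript):
    """Pick the category whose cue words occur most often in the transcript.

    Ties go to the category listed first in CATEGORY_CUES.
    """
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = {
        category: sum(word in cues for word in words)
        for category, cues in CATEGORY_CUES.items()
    }
    return max(counts, key=counts.get)


# Two invented transcripts about the same person.
biography = (
    "Stephen Hawking was a physicist whose research on black hole "
    "radiation changed theory."
)
comedy = (
    "This Stephen Hawking joke is hilarious, the funny part comes at the end."
)

print(categorize(biography))  # → educational
print(categorize(comedy))     # → entertainment
```

Both transcripts name the same person, yet the surrounding vocabulary pushes them into different groups, which is the point the paragraph above makes.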

      Video Title and Description

      Did you know that YouTube has more than one algorithm? The AI uses multiple systems, and each has its own objective. The surface features viewers see are:

       Browse Features: Homepage and Subscription

       Suggested

       Trending

       Notification

       Search

      Additionally, YouTube is constantly running experiments, several thousand a year, and implements about 1 in 10 of them as it goes, which translates to hundreds of changes annually. These changes help the system get smarter, and smarter means better at feeding viewers what they will watch.

      Browse: Homepage

      YouTube's Homepage has changed over time. Users no longer have to type a query in Search or put in the work to navigate. The Homepage used to show users only video recommendations from channels they had subscribed to. Now the Homepage has a personalized
