The Global Dissonance & Disruption Of Music Metadata

By Corey Denis

Remember when a typo on an album meant it was a rarity? Those days are over.

Bad music metadata is a big, big problem for artists, services and the proliferation of truth in art history. Services like Spotify, Rdio, Pandora, MOG and user generated playlist services are fueled by faulty data.

Calling any technology disruptive without first asking whether the listener notices its presence during the act of consumption neglects the high standard set by thousands of years of engineering: to turn technology into an act of magic, or a reasonably unseen extension of the human body. Great technology is invisible.

While streaming music is a technology breakthrough of sorts, maybe, it’s not much different than it was in the 1970s. Streaming music is still just a reworking of the same radio waves by which you ingest and cache music and, unless you pay extra, it’s loaded with advertising just as it always was. Downloadable music is a relatively new way to purchase, own, or steal music, but music has been for sale as a format for over 100 years, and the concept of making music mobile has been around since before I was born. I still have my first boombox somewhere in storage. It’s next to my yellow waterproof Walkman.

Music has been available for sale since sheet music was copied and shared with fellow musicians. The problem of metadata errors, and of generally unhealthy, inaccurate music metadata spreading around the internet and into the libraries of millions of music lovers, is a much newer problem. It is a problem so large that it’s easier to ignore, but ignoring it only creates a less viable industry in the future.

Nothing screams “this music streaming product exists and lacks magic” more than a metadata error, and for fans of classical music, this is simply the way streaming music is experienced.

Unhealthy metadata affects more than searchability; it enables and encourages a general worldwide disrespect for art.

Despite incorrect spellings or inaccurate data around music in all services, the New York Times published a piece on August 25th oddly pinning down global music service Spotify as part of the problem.

But the NYT got it wrong. Faulty data comes from labels accustomed to dealing with data in CD format and from user generated content sites that do not step in to correct bad data. Services are left to aggregate data from hundreds of sources without any set standard, and the data differs by genre. Music fans and artists themselves probably notice many errors in digital music today, but they may not realize that almost nobody is fixing those errors, because after 10+ years of digital distribution there are simply too many of them. The sad reality? There’s no priority to get it right. The money comes from elsewhere. As of now, there’s no money in the proliferation of truth in art history.

Services receiving data (music) from labels major and small have to reconcile multiple formats into one system. Most of this data comes from the Compact Disc. As of right now, there is no one industry standard for compliant data.
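To make that reconciliation concrete, here is a minimal sketch of what a service has to do when two labels describe the same recording with different field names. Everything in it is an assumption for illustration: the two feeds ("label_a", "label_b"), their field names, and the `Track` record are hypothetical, not any real label's format or any real service's schema. The only real-world element is the ISRC, the recording identifier the industry already uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """One common record that every incoming feed is mapped into (hypothetical)."""
    title: str
    artist: str
    isrc: str  # International Standard Recording Code, stored without hyphens


# Hypothetical field names for two imaginary label feeds.
FIELD_MAPS = {
    "label_a": {"title": "TrackName", "artist": "Performer", "isrc": "ISRC"},
    "label_b": {"title": "song", "artist": "artist_name", "isrc": "isrc_code"},
}


def normalize(source: str, raw: dict) -> Track:
    """Map one raw feed entry onto the common Track record."""
    mapping = FIELD_MAPS[source]
    # Collapse stray whitespace, a common kind of data-entry noise.
    cleaned = {
        field: " ".join(str(raw[key]).split())
        for field, key in mapping.items()
    }
    # Store the ISRC in one canonical form: no hyphens, upper case.
    cleaned["isrc"] = cleaned["isrc"].replace("-", "").upper()
    return Track(**cleaned)


a = normalize("label_a", {"TrackName": "Example  Song ",
                          "Performer": "Example Artist",
                          "ISRC": "us-abc-11-00001"})
b = normalize("label_b", {"song": "Example Song",
                          "artist_name": "Example Artist",
                          "isrc_code": "USABC1100001"})
print(a == b)  # the two feeds now describe the same record
```

With only two feeds this is trivial. With hundreds of sources and no shared standard, every new source needs its own mapping, and every mapping is a new place for errors to creep in.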

User generated playlist sites generally do not appear to correct bad metadata, so when another user accesses a song from content that was uploaded into the system, the problem proliferates across the web, enabling other users to consume more faulty data.

Elements of Faulty Data:

– The CD model, part one: Classical music is largely packaged into 75-minute compilations, but almost every “track” has its own featured composer, soloist, conductor and symphony. Compilations usually wind up listed as “various artists,” which stems from the CD model. For classical music data, a better alternative might be to disengage from the CD model and reconstruct previous compilation releases into a streaming format that makes classical more searchable and obtainable, even though classical music listeners are fewer in number. Who the heck wants to walk around knowing they accidentally dissed Bach by saying “I have no idea who wrote this piece and I’ll never know because I can’t find it again on this service”? Again, this is a very new problem. Never had this problem at the local record shoppe.

– The CD model, part two: It’s time to de-aggregate content. Not all genres are searched with the same terms and data points. The CD does not translate to the next phase of digital music (streaming and downloading). It is time to stop relying on CDs as the main source of data, as the data does not translate in a uniform way once removed from the Compact Disc itself.

– Let’s Get Honest: OK, music industry people, let’s just admit we’ve known about this since the late 1990s and yet as a whole we haven’t fixed it. We’d rather sell mislabeled art than not sell anything at all. Where is the artist or their legacy in all this? Less than a penny per stream (at best) which isn’t paid at all, ever, if the data isn’t correct. Bad metadata leads to unpaid artists. Period.

– User Generated Content (UGC): Services that allow users to upload from their own personal libraries and fail to fix faulty data points assist in the proliferation of bad metadata. It can be posited that UGC music companies benefit from faulty data, as it assists in their avoidance of paying out royalties appropriately. There is absolutely no incentive for any UGC streaming company to respect art. Ignore the marketing; ultimately the service wants to sell ad space and use your data to do just that. They do not mind if the song title is spelled wrong, or if the featured artist didn’t get their due credit. Why should they?

– Sloppy, Hacky Conventions Are The Standard: Also related to the CD model, antiquated hacky methods of creating and distributing data are now standard, because there is not yet one industry standard for metadata.
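The classical music point in the list above can be sketched in a few lines. This is only an illustration, under the assumption that each track carries its own composer, soloist, conductor and ensemble instead of one album-level “various artists” credit. The `ClassicalTrack` record, the `search` helper and the performer names are hypothetical placeholders, not any service’s real schema or catalog.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ClassicalTrack:
    """Per-track credits instead of one album-level artist (hypothetical)."""
    work: str
    composer: str
    soloist: str
    conductor: str
    ensemble: str


def search(tracks, term):
    """Return every track where any credit or the work title contains the term."""
    needle = term.casefold()
    return [
        track for track in tracks
        if any(needle in value.casefold() for value in astuple(track))
    ]


# Two tracks from one imaginary compilation; performers are placeholders.
compilation = [
    ClassicalTrack("Brandenburg Concerto No. 3", "Johann Sebastian Bach",
                   "Example Soloist", "Example Conductor", "Example Ensemble"),
    ClassicalTrack("Symphony No. 5", "Ludwig van Beethoven",
                   "Example Soloist", "Example Conductor", "Example Ensemble"),
]

for track in search(compilation, "bach"):
    print(track.work, "/", track.composer)
```

Filed under “various artists,” neither track would be findable by composer. With per-track credits, a listener who searches for Bach gets the piece back.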

The solution for truth in art history is Disruption

Organizations like DDEX create solutions for uniform data aggregation, but without aggressive adoption by the industry in its entirety, the problem of bad metadata persists. The solution for classical music and for music of all genres is the aggressive adoption of a standardized music metadata format by the industry as a whole: all of the major labels, independent labels, digital distributors, marketers, managers and the music services building magnificent products enabling art lovers all over the world to listen. Because, after all, isn’t that why we went into this business in the first place? We love music. Let’s show consumers just how much by building and adopting solutions.

Posted by Ted • Wednesday, August 31, 2011.