RDS (or RBDS in the USA) is a standard for transmitting auxiliary data on a subcarrier of your FM signal. Components of this include the Program Service (PS) name and Radio Text. These commonly carry the station’s name and current song data, but RDS can cover a lot of other auxiliary data services such as traffic information.
Recently, I’ve been developing a Python script to reliably gather this data from my Pira.CZ FM Broadcast Analyser and log it to a database. As an experiment, I’ve also been publishing this to a Twitter feed: @SydneyRDS.
My original attempt at this was in 2011, using the software provided at the time. As I wanted to gather data from multiple stations at short intervals, it was unreliable and often left me with incomplete Radio Text strings. This isn’t great if you want to do anything serious with the data. I turned it off for almost 2 years before I revisited the concept.
To try and avoid the same issues this time around, I decided to request the data in raw format from the analyser’s serial port, parse it myself, and do some error correction/detection. Simple, eh? It’s actually not that hard. My Python script is only about 180 lines at the moment.
All of the components are available and ready to go:
- Python has a very good serial library, PySerial, which was able to easily communicate with my analyser.
- The analyser’s manual has detailed information on the various serial commands needed.
- I found an RBDS protocol manual online, which it turns out is pretty close to the actual RDS standard used outside the USA (read this list of differences for further background).
- In the interests of running this 24/7, I used my Raspberry Pi to run the script and upload the data to my web server and database.
- Development took place on my Mac and was deployed to the Raspbian flavour of Debian. This should also work on Windows.
As my script is still under testing and development, I’m not going to share the whole thing here. Instead, I’ll share the ideas and concepts behind it as well as a few source code snippets. If you really, really, want the script you can email and beg me for it, but it’s fairly specific to my uses at this stage. If I get a chance, I’ll generalise it and release as open source.
Update 30th December 2015: This script is now available as open source code. Grab the code on GitHub.
Serial Connection and Commands
The first thing to do is connect your analyser / RDS decoder to your computer and try and run some serial commands against it. On a PC, you can use PuTTY to connect. On a Mac, I used Terminal with the screen command. When it came time to build my script in Python, I used this code to connect to the serial port:
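from serial import Serial, EIGHTBITS, PARITY_NONE, STOPBITS_ONE

ser = Serial(
    port='/dev/ttyUSB0',        # adjust to wherever your analyser shows up
    baudrate=19200,
    bytesize=EIGHTBITS,
    parity=PARITY_NONE,
    stopbits=STOPBITS_ONE,
    timeout=0.1,                # read timeout in seconds
    xonxoff=False,              # no software flow control
    rtscts=False,               # no hardware flow control
    interCharTimeout=None,      # called inter_byte_timeout in pySerial 3.x
)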
Once you’ve connected, you can send the command <FREQ>*F to tune to your station (<FREQ> is in the format 103200 for 103.2 MHz). No newline is required to send the command. Don’t expect any return on this one. All the serial commands are in the manual.
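As a minimal sketch of the tune command described above, the helper below builds the bytes to send. The function name tune_command is made up for illustration; only the <FREQ>*F format comes from the analyser’s manual.

```python
def tune_command(freq_mhz: float) -> bytes:
    """Build the tune command, e.g. 103.2 MHz -> b"103200*F"."""
    # The analyser expects the frequency in kHz with no decimal point.
    freq_khz = round(freq_mhz * 1000)
    # No trailing newline is needed after the command.
    return f"{freq_khz}*F".encode("ascii")
```

With an open port you would then call something like ser.write(tune_command(103.2)).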
RDS Format
Assuming your station is now tuned, you can send the command *R to start receiving RDS Group data. This is the raw data (almost) as received by your analyser. It’s in hex format. There should be four blocks of four hex characters, for a total of sixteen hexadecimal characters or 64 binary bits. To stop it, run the command *r.
It’s important to note that the Pira FM Analyser will do the error detection on each block of 4 hex characters for you using the RDS protocol’s checkwords. If it doesn’t check out correctly, it replaces that entire block with four dashes. I discard these.
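Here is a rough sketch of splitting one received group into its four blocks. It assumes each group arrives as a line of exactly sixteen characters with no separators, which may not match your analyser’s output exactly, so check the manual. The name split_group is hypothetical.

```python
import string
from typing import List, Optional

def split_group(line: str) -> Optional[List[Optional[int]]]:
    """Split a raw group into four 16-bit blocks.

    A block the analyser replaced with "----" becomes None.
    Returns None if the line does not look like a group at all.
    """
    text = line.strip()
    if len(text) != 16:
        return None
    blocks: List[Optional[int]] = []
    for i in range(0, 16, 4):
        chunk = text[i:i + 4]
        if chunk == "----":
            blocks.append(None)            # failed the checkword test
        elif all(c in string.hexdigits for c in chunk):
            blocks.append(int(chunk, 16))  # four hex characters = 16 bits
        else:
            return None                    # not a group line
    return blocks
```

Keeping the failed blocks as None, rather than dropping the whole group, lets later code still use the blocks that did arrive intact.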
This diagram (from the RBDS document above) shows the actual binary makeup of the RDS Group Blocks:
What is in each Block? Glad you asked!
As you can see, the PI code is the first 4 hex characters (1 block). A PI is a unique identifier of the station. Every station in your market should have a different one. Since these are set up as hex anyway, there’s no trouble decoding this one.
The second block contains the group type code, which identifies the data stored in the rest of the Group. It also contains a Traffic Program (TP) flag, the Program Type code, and space to store some extra data. We’ll get to this in a second.
The last two blocks contain the actual data. An example of this is the Radio Text data, which carries free-form text such as song details. As you can’t fit much into 32 bits, the last four bits of Block 2 generally contain the starting offset for the Block 3 & 4 data.
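To make the Block 2 layout concrete, here is a small sketch that pulls the fields out of the 16-bit value with shifts and masks. The bit positions follow the RDS group structure (4-bit group type, 1 version bit, TP, 5-bit PTY, 5 group-specific bits); the Block2 tuple and decode_block2 names are my own for illustration.

```python
from typing import NamedTuple

class Block2(NamedTuple):
    group_type: int   # 0-15, e.g. 0 = basic tuning/PS, 2 = RadioText
    version_b: bool   # False = version A, True = version B
    tp: bool          # Traffic Program flag
    pty: int          # Program Type code, 0-31
    extra: int        # five group-specific bits (e.g. text segment offset)

def decode_block2(block2: int) -> Block2:
    """Unpack the fields of Block 2 from its 16-bit integer value."""
    return Block2(
        group_type=(block2 >> 12) & 0xF,
        version_b=bool((block2 >> 11) & 0x1),
        tp=bool((block2 >> 10) & 0x1),
        pty=(block2 >> 5) & 0x1F,
        extra=block2 & 0x1F,
    )
```

For example, decode_block2(0x2541) gives group type 2, version A, TP set, PTY 10 and an extra value of 1.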
How to Parse this Data
Using the information provided above, and the highly detailed specs in the RBDS manual, you can fairly easily create a parser to take this hex data and turn it into something human-readable. Here’s some pseudocode Python to show what I did:
What I’ve done here is create an infinite loop. In each iteration of the loop, it reads the serial buffer and attempts to parse it. Binary “00100” in the first five bits of Block 2 (group type 2, version A) indicates RadioText A. Binary “00000” (group type 0, version A) indicates PS. Using some basic conversion functions, we can parse this and save it to an array for further use.
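The per-group step of such a loop might look something like the sketch below. This is not my original script, just a minimal illustration: new_state and apply_group are hypothetical names, it treats each byte as a plain Latin-1 character (the real RDS character set differs for some codes), and it ignores the RadioText A/B flag.

```python
def new_state():
    """Empty text buffers: PS is 8 characters, RadioText is 64."""
    return {"ps": [" "] * 8, "rt": [" "] * 64}

def apply_group(blocks, state):
    """Update the PS / RadioText buffers from one group.

    blocks is [pi, block2, block3, block4] as integers,
    with None for any block that failed error detection.
    """
    _pi, b2, b3, b4 = blocks
    if b2 is None:
        return                      # cannot tell what the group is
    kind = b2 >> 11                 # first five bits: group type + version
    if kind == 0b00000 and b4 is not None:
        # Group 0A: last two bits of Block 2 select a 2-character PS segment.
        pos = (b2 & 0x3) * 2
        state["ps"][pos:pos + 2] = [chr(b4 >> 8), chr(b4 & 0xFF)]
    elif kind == 0b00100 and b3 is not None and b4 is not None:
        # Group 2A: last four bits select a 4-character RadioText segment.
        pos = (b2 & 0xF) * 4
        state["rt"][pos:pos + 4] = [
            chr(b3 >> 8), chr(b3 & 0xFF), chr(b4 >> 8), chr(b4 & 0xFF),
        ]
```

In the real loop, each line read from the serial port would be split into its four blocks and handed to apply_group, and "".join(state["rt"]).strip() would give the text gathered so far.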
What about error correction?
The trouble I had originally is this: how do you know if you have a full RDS RadioText message? Unless you have perfect reception and no errors, it’s difficult to tell. What I did was constantly check whether the text has changed in the past X seconds. If it hasn’t, I assume it’s a full message and move on to the next station.
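One way to sketch that “no changes for X seconds” check is a small tracker that remembers when the text last changed. The StableText class and its quiet_seconds parameter are illustrative names, not taken from my script, and the right value of X will depend on your stations.

```python
import time

class StableText:
    """Report a text as complete once it has stopped changing."""

    def __init__(self, quiet_seconds: float):
        self.quiet_seconds = quiet_seconds
        self.text = None
        self.last_change = None

    def update(self, text, now=None) -> bool:
        """Record the latest text; return True once it has been
        unchanged for at least quiet_seconds."""
        now = time.monotonic() if now is None else now
        if self.last_change is None or text != self.text:
            self.text = text
            self.last_change = now   # restart the quiet period
        return now - self.last_change >= self.quiet_seconds
```

Call update() with the current RadioText on every pass through the loop, and switch stations the first time it returns True. The now argument is only there so the timing can be tested without waiting.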
There are probably better ways to achieve this, and if so I encourage you to share these in the comments. You can check my experimental Twitter feed to see if there have been any errors in this regard. In the few days I’ve had it running so far, the errors I’ve seen have been minimal.
I also find it’s important to clear the serial buffer when changing station. This ensures you have no residual groups stored up from the previous one.
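A station change along those lines could be sketched as follows. The switch_station name and the exact ordering are assumptions on my part; reset_input_buffer() is the pySerial 3.x call (older versions call it flushInput()), and you may need a short pause before clearing if groups are still in flight.

```python
def switch_station(ser, freq_khz: int) -> None:
    """Retune the analyser and throw away any groups from the old station.

    ser is an open pySerial port (or anything with the same methods).
    """
    ser.write(b"*r")                             # stop RDS group output
    ser.write(f"{freq_khz}*F".encode("ascii"))   # tune, e.g. 103200 for 103.2 MHz
    ser.reset_input_buffer()                     # discard residual groups
    ser.write(b"*R")                             # start RDS group output again
```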
What next?
I’ve got this running 24/7 on a Raspberry Pi so I can start gathering some serious data about songs played on stations here in Sydney. I plan to keep this going and eventually open up the data for analysis. The ideal situation would be a nice website with playlists for all the tracked stations, analysis, trends, etc. Just having this data can be quite valuable. This will go up on TrackTracker.net if/when I get the chance.
Sure, it’s not foolproof and wouldn’t come close to rivalling services such as AirCheck. They use audio fingerprinting and human QA to gather their data. Theirs is a very different approach, and doesn’t rely on the accuracy of the station’s RDS data.
A further extension of this would be to employ the fingerprinting services of Echonest. It would be pretty cool to have an Innes Corp Radcap (FM, or even DAB+!) constantly generating fingerprints and comparing them against the Echonest database to detect which songs aired and when. This would theoretically work on any radio station, not just those which output RDS song data. Of course, this would require expensive hardware and some pretty decent smarts. And probably a way to detect audio stretching. That’s probably why AirCheck have so few competitors (this, plus their ‘patented technology’).
I welcome your feedback on all of this. It’s just the start of an experiment. The amount of interest I get will determine how much time I invest into it. Feel free to write in the comments below, or ping me on Twitter.