Categories
Cycling Ride recording Turbo and training s/w

Merging Strava and Garmin ride files to create complete courses

A while ago, I posted an example of how to edit .gpx and .tcx files when there might have been a small corruption of the ride file preventing it from being uploaded to Strava. See it at  http://www.briansutton.uk/?p=330.

This post concerns the situation when you have been on a long ride, and the battery on your first Garmin Edge (or similar) GPS has run out and you have continued the ride with a mobile phone app (such as the GarminConnect or the Strava app) or another Garmin, and therefore have two, consecutive files that need to be merged to create a single ride file.

In the past I have used the Strava recommended website and tool at http://gotoes.org/strava/Combine_GPX_TCX_FIT_Files.php which is very easy to use and usually DOES create the single file required. It IS necessary to follow the advice in the app to randomise the ID number of the ride (in case either part of the ride has been uploaded to Strava previously – duplicate IDs are not allowed and will prevent the new file from uploading). Once it IS successfully uploaded, make the old files private (on Strava) to avoid doubling up on the relevant distances and elevations! GarminConnect refuses to upload “duplicates”.

By the way, this tool does handle any mix of .gpx, .tcx and .fit files as well; .fit files are digital, not text based (saving a lot of space on the more modern Edge devices that use it as their native file format) but Gotoes (and Strava and GarminConnect) convert them to .gpx or .tcx format for display (your choice when merging) and these formats ARE viewable and editable sensibly in good text editors such as the free one I use, TextWrangler (or its upmarket and chargeable version, BBEdit).

What I have found, however, is that although this Gotoes technique DOES create a file that is readable by Strava as a complete activity, and shows the whole map of the route both for the activity (AND for a course created from it) appropriately on the GarminConnect website, it doesn’t do the whole job when you need to download the course derived from the ride activity to your Garmin for later use.

What happens is that when the second ride recording device is started, although it records latitude and longitude correctly (of course), the cumulative distance parameter starts again from zero (naturally). The effect of this is that when you try to turn the whole, Gotoes merged ride into a Garmin course, although the online GarminConnect shows the course correctly, when downloaded by Garmin Express to an Edge device as a course, the course is truncated at the changeover point where the second Garmin started recording. So the downloaded course on the Garmin device is incomplete, only showing the course from the first part of the ride.

It turned out to be quite a task to put this right, and this post describes the process. It’s a long post, but working all this out and doing the conversions took me most of a day! The author of the Gotoes app, Brian Lucido, has since very kindly and very quickly taken a look at this for me, and his comments and emerging solution are at the end of this article.

Extract from the start of the combined ride .xml file

What follows here is the .tcx file preamble (ride summary), and also the first trackpoint from the combined .tcx file created by Gotoes, but it includes in its preamble section the distance recorded by second (!) device used on a ride when the first ran out of battery life, not the complete distance. I am surprised it used the second Garmin (shorter) distance for this ride (Andratx to Port Pollensa), because in my first attempt at doing this for another ride on another day (Port Pollensa – Sa Calobra loop), it took the FIRST Garmin (longer) distance (!):

<?xml version=”1.0″ encoding=”UTF-8″?>
<TrainingCenterDatabase
xsi:schemaLocation=”http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2 http://www.garmin.com/xmlschemas/TrainingCenterDatabasev2.xsd”
xmlns:ns5=”http://www.garmin.com/xmlschemas/ActivityGoals/v1″
xmlns:ns3=”http://www.garmin.com/xmlschemas/ActivityExtension/v2″
xmlns:ns2=”http://www.garmin.com/xmlschemas/UserProfile/v2″
xmlns=”http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2″
xmlns:xsi=”http://www.w3.org/2001/XMLSchema-instance” xmlns:ns4=”http://www.garmin.com/xmlschemas/ProfileExtension/v1″>
<Activities>
<Activity Sport=”Biking”>
<Id>2016-05-13T07:14:07Z</Id>
<Lap StartTime=”2016-05-13T07:14:07Z”>
<TotalTimeSeconds>30363.5</TotalTimeSeconds>
<DistanceMeters>20586.637522382</DistanceMeters>
<MaximumSpeed>58.169</MaximumSpeed>
<Calories>3054</Calories>
<AverageHeartRateBpm>
<Value>55</Value>
</AverageHeartRateBpm>
<MaximumHeartRateBpm>
<Value>143</Value>
</MaximumHeartRateBpm>
<Intensity>Array</Intensity>
<Cadence>0</Cadence>
<TriggerMethod>Manual</TriggerMethod>
<Track>
<Trackpoint>
<Time>2016-05-13T07:23:43Z</Time>
<Position>
<LatitudeDegrees>39.57607</LatitudeDegrees>
<LongitudeDegrees>2.42579</LongitudeDegrees>
</Position>
<AltitudeMeters>76.2</AltitudeMeters>
<DistanceMeters>0</DistanceMeters>
<HeartRateBpm>
<Value>88</Value>
</HeartRateBpm>
</Trackpoint>

The first highlighted line is in the file preamble, showing the supposed total length of the ride, 20586.637522382 kms, and the second highlighted line is the correct initial starting distance for the combined ride (zero) i the first trackpoint of the combined ride.

We can see highlighted below that the preamble ride distance in the instance is probably picked up from the second Garmin, because the last trackpoint in the combined file has used that distance (the time, latitude and longitude are correct of course, for the ride termination point!):

Last trackpoint in the Gotoes combined ride file

<Trackpoint>
<Time>2016-05-13T15:55:25Z</Time>
<Position>
<LatitudeDegrees>39.909860</LatitudeDegrees>
<LongitudeDegrees>3.083439</LongitudeDegrees>
</Position>
<AltitudeMeters>7.241</AltitudeMeters>
<DistanceMeters>20576.285</DistanceMeters>
<HeartRateBpm>
<Value>108</Value>
</HeartRateBpm>
</Trackpoint>
</Track>

The problem is that the whole ride distance was 116536.455 kms, as we see below from the last trackpoint in my own, final modified ride file, created using the techniques I explain below. That combined distance, of course, does not appear itself in either Garmin, it has to be calculated, as it does for every other trackpoint in that second part of the ride (by adding on the last trackpoint distance from the first Garmin):

Final, corrected combined ride trackpoint in my reconstructed ride file

<Trackpoint>
<Time>2016-05-13T15:55:25Z</Time>
<Position>
<LatitudeDegrees>39.909860</LatitudeDegrees>
<LongitudeDegrees>3.083439</LongitudeDegrees>
</Position>
<AltitudeMeters>7.241</AltitudeMeters>
<DistanceMeters>116536.455</DistanceMeters>
<HeartRateBpm>
<Value>108</Value>
</HeartRateBpm>
</Trackpoint>

The problem is caused by Gotoes somehow picking up the trackpoint distances recorded by the second Garmin and treating them as valid for the whole ride, whereas they are only valid for the second part of the ride;  ideally the distance would have started at the end distance recorded by the first Garmin device, but of course the second Garmin doesn’t know about the first one, and in effect the <DistanceMeters> has been reset to zero at that point.

The real problem is that when merging the two files, Gotoes should adjust every one of those trackpoint distances for the second Garmin ride file entries, by adding on the final recorded distance of the first one, to each one, but it doesn’t. This is what I have had to do manually using the following techniques (to do it globally rather then editing hundreds of them one by one!).

Maybe the problem would be resolved if all distances were calculated from latitude and longitude data at the time of Gotoes merge, because these ARE correct, but because Garmin adds (the possibly redundant (see my comments at the end of the article)) <DistanceMeters> parameter to every trackpoint, Gotoes seems to use this instead of recalculating distances from latitude and longitude data. Maybe this is why Gotoes merge works for activities up on the Strava and Garmin websites – possibly those websites DO use the latitude and longitude to (re-)calculate and display ride statistics and maps, and for online course display; but not, apparently, when downloaded to a Garmin Edge device. Maybe the Garmin Edge device doesn’t have that function built in for recalculating downloaded courses, and just uses the <DistanceMeters> parameter when doing that.

In summary, this doesn’t seem to be a problem for the activity itself on GarminConnect and Strava, but it does cause an issue for the Course created from the activity once downloaded to a Garmin Edge; hence the need to edit the Gotoes file.

XML file editing process

This took me half a day to work out, but then only 1/2 hour to implement for each of two such rides! I’ll just describe the essentials, and can expand further for anyone who needs help!

The issue is that any text editor, whether a really good one like TextWrangler or BBEdit, or a basic one, doesn’t do arithmetic! In the file from which extracts were taken above, there were hundreds of trackpoints in error (the second part of the ride, on the second Garmin) as far as distance is concerned. This was far too many to adjust manually (by adding the closing distance parameter from the first Garmin (which was 95950 kms) to every trackpoint distance from the second Garmin).

I should say here (for the “techies”) that I first thought of using the “regular expression” editing option (called “grep” in the web editing trade) in TextWrangler to isolate the <DistanceMeters>123.456</DistanceMeters> distance parameters, which is very easy (123.456 is obviously going to be different for every trackpoint for the ride). But then there is no way to edit or paste in (globally in one operation) each different trackpoint distance within TextWrangler (varying distances here represented by 123.456) by adding the constant 95950 to each original one.

One can find and select the trackpoints with their different distance data globally using grep wildcard characters (I used the search term:

<DistanceMeters>.*\..*</DistanceMeters> and it works for any amount of different numbers either side of the decimal point in the distance numbers)

but there isn’t a way to do a selective, global replacement with the (different) correct distances for each trackpoint.

What I found was necessary is to import the relevant thousands of lines (“only” hundreds of distance lines but each trackpoint has a dozen or so other lines of data) of the .tcx file into Excel (where you CAN do arithmetic!) as a (html) file delimited by certain characters;  isolating the distance figure in each <DistanceMeters> line into a separate Excel column; add on the constant starting distance 95950 kms to each distance number entry in that Excel column; and then save the file with the modified distances as an Excel text .txt file. This file is then opened by TextWrangler, where, if it is done the right way (as explained below), the original file format can easily be recreated and the data (the whole new file, in essence) substituted into the second part of the original, merged ride file (i.e. the part from the second Garmin whose <DistanceMeters> data had started at zero instead of 95950 in this case) to create a ride file properly representing the whole ride and its complete distance.

More TextWranger XML editing, Excel import, Excel editing and final TextWrangler XML editing details

What took so long to work out?! Here’s the a) to r) of the process:

a) the import of the second Garmin file into Excel (by default an Excel .txt file) requires 1) only importing the trackpoint items (hundreds of them) and 2) not the preamble (easy);

b) work out a way of specifying the delimiting characters such that the distance data is isolated into its own column in Excel when imported (more difficult). This is the most important and crucial step, it turns out. What I had to do (there might be a better way) was replace the surrounding <DistanceMeters> and </DistanceMeters> tags with xx and xx/ respectively (the / is possibly redundant, but I wanted to be able to edit back in the original tags which are different for opening and closing tags). “xx” is unique in the file, there being no other instances;

c)  add on the starting distance (95950 for starting the the second part of the ride, not zero! i.e. the closing distance for the first part) to each line where the distance parameter occurs (in Excel’s column C, say); this is best done by doing a conditional add into another (intermediate temporary) empty column (e.g. using the Excel formula if(C1<>””,C1 + $G$1,””) defined for row 1 in the temporary column, and then pasted into every row cell in that temporary column in the spreadsheet, where the constant add-on distance 95950 has been stored in cell G1, say) where we only add on the distance 95950 where the existing distance column has a number in it (i.e. isn’t blank). Then this new column is “pasted special” into the original distance column, choosing the option values only (since we want nothing but text or number data in the new file, NOT formulae). The intermediate, temporary column is then deleted;

d) add a new column either side of the new, replaced distance data column, ready to receive some new tags (i.e. <DistanceMeters> tags instead of the unique xx used purely to isolate the distances for the Excel import);

e) similarly, as for the distance data, conditionally (e.g. using if(C1<>””,<DistanceMeters>,””) to add this opening tag text to another column to recreate the original style opening tags, only in rows where the cell in the distance data column isn’t blank;

f) similarly, conditionally (e.g. using if(C1<>””,</DistanceMeters>,””) to add this closing tag text to yet another column to recreate the original style closing tags, only in rows where the cell in the distance data column isn’t blank;

g) again using paste special, paste the values only from these two intermediate columns into the new columns created either side of the actual distance data column;

h) delete the two intermediate columns;

i) save the new Excel .txt file and open it in TextWrangler (which allows opening any text based file, including Excel delimited ones (just like .csv files, although this is a .txt file in Excel), with those new  <DistanceMeters> and </DistanceMeters> columns we created in Excel, effectively undoing the “xx” tag changes we made as part of step b);

j) globally edit out any spaces left between the distance data items and the new <DistanceMeters> and </DistanceMeters> tags introduced in Excel; there are some left in the the Excel .txt file when it opens in TextWrangler. This is two very simple standard global edits, e.g replacing:

“<DistanceMeters>   ” with “<DistanceMeters>” (using “Replace all”) and similarly for;

”    </DistanceMeters>” with “</DistanceMeters>”;

k) copy and paste the whole file of new trackpoint items (no preamble or “postamble”, remember, trackpoints only) into the second part of the original Gotoes merged file to replace the old, second Garmin data from the changeover point;

l) The summary ride length parameter in the Gotoes merged .tcx file preamble was actually, in another file I had to alter, the longer of the two ride parts, and in this one it was the shorter one! Naturally I changed that to the total length in each one, a simple single edit each time.

m) run the add-on XML formatting option in TexWrangler (the Text Filter “XML Tidy” plugin which I had added to my TextWrangler before) over the whole file to achieve the right indentation (for appearance and accessibility only, with no change of content); I use this because sometimes when using TextWrangler to import xml data (as with Gotoes) the xml data is inserted as an extremely long string of data. Soft wrap doesn’t sort this, but there are reformatting tools like XMLTidy, JSONTidy and PHPTidy (and others) that don’t change content but re-present it in the accepted format for such file structures.

n) delete all versions of the original two parts of the ride activity in GarminConnect, and any website incorrect courses derived from them (Garmin won’t upload files it thinks are duplicates);

o) upload the new file to GarminConnect as an activity;

p) turn the activity into a course using the standard GarminConnect website dialog;

q) download the new course into the Garmin Edge of your choice (both of them in my case!)

r) check that the whole course is in the map option in the Garmin Edge – at last it is!

Appendix: Even a little more on Step b)!

In order to get the Excel import of a tab or character delimited file to isolate a numeric item in any line, we have to find a unique character that opens and closes that data item in TextWrangler, in any line in which it appears, before importing into Excel. Each trackpoint entry in the .tcx file has a dozen or so lines as we see above from the extracts, and only one of them for each trackpoint has the distance data for each (of the hundreds) of the trackpoints.

An issue is that the more obvious and usual delimiters from the xml tagging, “<” and “>”,  appear for all other data items too, so using them, or any other alphabetic single letter (Excel import only allows one delimiter character!) one might choose in <DistanceMeters> or </DistanceMeters> is no good for isolating distance data, as all of the letters in them happen to be used in other lines of each trackpoint. The delimiter character needs to be unique to the distance data item line in order to isolate the imported distance data into its own Excel column.

I first thought of changing <DistanceMeters> to “x” all through the .tcx file, but in the end decided to use “xx” for the opening and “xx/” for the closing tag each time – maybe unnecessary but I wanted to be super-safe there were no other single “x”s, or that the single character delimiter “x” in the Excel import dialog didn’t cause confusion when it was used for both opening and closing, and I added the “/” (to distinguish opening and closing delimiters) which in the end I just deleted in the excel file as it turned out not to be necessary. I used double x “xx” in case that would create a column either side of the distance data, but it didn’t! But it DID isolate the distance numbers themselves into their own column which was the object of the exercise.

The three successive Excel import process dialogs thus looked like:

Excel import dialog #1
Excel import dialog #1

 

Excel import dialog #2
Excel import dialog #2

 

Excel import dialog #3
Excel import dialog #3

PS I should say that I reported this to Brian Lucido (the Gotoes developer) at Gotoes.org on their website at 00.27 Mar 25th UK time, and he responded at 00.35! Mightily impressive in the early hours of Sunday UK time (I think they are 7 hours behind, but it still Saturday evening for them, when they should be at the pub!) He is looking at it now, defining the issue as the distance resetting as each new part of the ride is added to the combined ride file. I’m off to bed now while they work on that! I’m losing an hour, as we all are tonight! Need my beauty sleep at my age!

I wondered why the distance doesn’t get calculated from the latitude and longitude positions, or rather, why is it reported in the ride file at all? It would seem to me that there is no point in adding redundant data (from a good database design standpoint, this seems contrary to 2nd Normal Form for a database – see https://en.wikipedia.org/wiki/Database_normalization) such as the calculated distance in those tags. It only helps if someone is reading the file; any training software that is reporting on the ride could do it from the latitude and longitude, and then there is no issue about multiple rides because the latitude and longitude are correct in both parts of the ride. I suppose that is a question for Garmin whose device operating software puts it in there! Brian Lucido said that it is (only) needed for indoor training activities where thee is no GPS data available.
To be honest I haven’t double checked the times and calories etc; I assume the time of day is good in both parts of the ride, the heart rate is instantaneous as well, but I’m not sure how calories gets into the summary.

Brian came up with a solution after I suggested the point about data redundancy – and just removed the <DistanceMeters> tag!  It’s the easiest thing to do, and he agreed it was redundant.  H looked into what it would take to fix the issue (problem was a private variable – so it gets reset to zero each time the subroutine is executed)… and decided that it would be easier just to comment out the <DistanceMeters> part.  He realized that his app prohibits files lacking lat/long in any case… <distancemeters> is only there for swimmers and people on trainers and treadmills who don’t have any lat/long data.  Those types of activities don’t really need a merge tool!

But the summary data for <DistanceMeters> is incorrect in the preamble part of the file too, so that would need to be corrected in his app,  Apparently, while he was looking, he might have seen a circumstantial bug with the power/cadence… and is going to fix that. More to come…