Tags/Highlights Format

General terms

Tag - a marker placed manually or automatically in the metadata of the recording. Tags can be used to quickly identify moments of interest within a video. The word 'tag' is often used to describe both a point in time and an interval generated manually (by pressing a button on the remote, the mobile client, or the camera).

Highlight - an interpretation of an individual tag, obtained by giving it a length (for example: 5 seconds before and 5 seconds after the moment the tag was placed). It may, but does not have to, contain the tag (i.e. the tag can fall outside the highlight interval).

Goals

  • Describe how tags and highlights are defined and how they are stored inside of the MP4 container.
  • Enumerate possible tag/highlight types and provide enough information so that it's clear how tags can be stored/edited/deleted/retrieved from the container.
  • Describe tag structure and types of data associated with it.

Background and strategic fit

Tags/highlights should be stored together with the video in such a way that, considering platform limitations (primarily SD card access speed), they can be quickly retrieved, updated, added and deleted. Without this, it won't be possible to manipulate them efficiently, which might slow down communication with smartphone devices and eventually degrade the user experience.

It is also important to understand the concept of the different tag/highlight types and the rationale for choosing one tag/highlight format over another.

Means of accessing and manipulating tags/highlights on the camera through its embedded web server are described in the Tag API section.

Tags/Highlights types

The main difference between tags and highlights is that highlights have a length assigned to them, whereas tags are just points in time. Based on the UX design and the ideas presented there, the Video Application should eventually record both tags and highlights and assume that they can be edited. It is conceived that there will be many types of highlights, some of which can be determined/recorded by the camera and some of which can be calculated later on another platform (such as a smartphone) and pushed onto the camera and into the MP4 container. The following table contains the currently known (requested) highlights and tries to pinpoint which of them will be generated by the video application and which will be generated later, during post-processing. It also defines the highlight types, their origin, the platform which detects/processes them, and the corresponding default intervals.

NOTE: The Enum column defines enum values for the different types of tags/highlights. Its use is described in the paragraph below.

| Tag/Highlight | Enum | Unit | Data Type | Precision | Manual/Automatic | Origin | Detected by | Default Interval |
|---|---|---|---|---|---|---|---|---|
| Tag button on camera pressed | 1 | - | - | - | Manual | Camera physical button | Camera event system | -6 sec |
| Tag button on mobile device pressed | 2 | - | - | - | Manual | Mobile application virtual button | Camera event system | -6 sec |
| Tag button on remote controller pressed | 3 | - | - | - | Manual | Remote controller physical button | Camera event system | -6 sec |
| Max. speed (HT_MAX_SPEED) | 4 | m/s | float | %.2f | Automatic | GPS | Camera autohighlight algorithm | +/- 3 sec |
| Max. heart rate (HT_MAX_HR) | 5 | bpm | int | %d | Automatic | Heart rate sensor (BLE) | Camera autohighlight algorithm | +/- 3 sec |
| Max. deceleration (HT_MAX_DECEL) | 6 | m/s^2 | float | %.2f | Automatic | GPS | Camera autohighlight algorithm | +/- 3 sec |
| Max. acceleration (HT_MAX_ACCEL) | 7 | m/s^2 | float | %.2f | Automatic | GPS | Camera autohighlight algorithm | +/- 3 sec |
| Max. vertical speed (HT_MAX_VERTICAL_SPEED) | 8 | m/s | float | %.2f | Automatic | Barometer | Camera autohighlight algorithm | +/- 3 sec |
| Max. rotation (HT_MAX_ROTATION) | 9 | deg/s | int | %d | Automatic | Gyro | Camera autohighlight algorithm | +/- 3 sec |
| Max. G-force (HT_MAX_G_FORCE) | 10 | mG | int | %d | Automatic | Accelerometer | Camera autohighlight algorithm | +/- 3 sec |

Tag/Highlight structure

Based on what is stated above, and in order to be uniquely identifiable, tags and highlights should contain the following information:

  • UUID - unique identifier which, ideally, should be unique among different tags, videos and cameras
  • Type - integer enum value defining type of tag/highlight
  • Time - offset from beginning of a video (float), marking moment when tag was detected
  • Start offset - beginning of highlight as an offset from beginning of the video (float)
  • End offset - end of highlight as an offset from beginning of the video (float)
  • Value - optional, depending on type. The data type is string, and it is interpreted differently based on the tag type. It can be considered a placeholder for tag metadata (the actual max. speed value, the name of a person, an interesting location, a JSON structure...)

The definition of a tag/highlight should therefore be (C/C++ example):

#include <stdint.h>
#include <uuid/uuid.h>    // uuid_t (libuuid); 16 bytes

#define TAG_SIZE        1024
// TAG_SIZE minus the size of all fields before 'value':
// 16 (uuid) + 2 (type) + 3 * 4 (time/start/end) = 30 bytes
#define VALUE_LENGTH    (TAG_SIZE - 30)

// Packed so the record occupies exactly TAG_SIZE bytes on disk,
// with no compiler-inserted padding between fields.
struct __attribute__((packed)) TTTag {
    uuid_t   uuid;                 // unique identifier
    uint16_t type;                 // enum value (see table above)
    float    time;                 // tag moment, offset from video start
    float    start;                // highlight start offset
    float    end;                  // highlight end offset
    char     value[VALUE_LENGTH];  // type-dependent payload
};
 

Having tag/highlight records of fixed length speeds up and simplifies CRUD (create, read, update, delete) operations on them. This is particularly important since these records have to be read from a movie file stored on an SD card with limited read/write speed. In order to improve UX and reduce response times, tag/highlight records are built and stored so that each record has a predefined size (currently set to 1024 bytes, but this can be changed later), records are stored in a contiguous file block, and the number of tags/highlights that one MP4 container can host is limited to 256 (a magic number that can be changed later, depending on requirements). As a consequence, it won't be possible to store more than (currently) 256 tags/highlights in one MP4; every request to add more than that will result in a response code indicating an error.

Storage

Tags/highlights are stored in the MP4 container, in the video index area. A special atom named 'TTHL' is created, and it contains all the records.

It's important to note that the TTHL atom shall have one predefined size (256 * sizeof(TTTag)), regardless of whether there are any tags/highlights in the video. This allows tags/highlights to be added/updated/deleted later on, during post-processing.
