Workflow guide
Introduction
This guide walks you through the complete GEM workflow, from initial setup to downloading your matching results.
Prerequisites
Before starting with GEM, ensure you have:
1. Access Requirements
- Console Access: Active account at my.tomtom.com
- Project Assignment: Assigned to a project with GEM access
- Authentication: Microsoft Entra ID credentials configured
2. Technical Setup
- Azure CLI Installed: Download from Microsoft
- Terminal Access: Command-line terminal on your machine
- Storage Space: Adequate local disk space for your data files
3. Data Preparation
- Data Format: Files in Apache Parquet format (`.parquet` extension)
- Required Fields: Your data must include:
  - `id` (integer): Unique identifier
  - `is_navigable` (boolean): Navigability flag
  - `geometry` (LineString WKT): Road geometry
Example data record:
1{2 "id": 5707295,3 "is_navigable": true,4 "geometry": "LINESTRING (145.18156715700002 -37.87340530899996, 145.1809221540001 -37.87356512499997)"5}
4. Knowledge Prerequisites
- Understanding of your source data structure
- Familiarity with data licensing requirements
- Basic command-line operations
Complete workflow
Step 1: Access GEM dashboard
- Navigate to the Dashboard
- Log in using your Microsoft Entra ID credentials
- In the left navigation pane, select the appropriate Project from the dropdown menu
- Click on Global Entity Matcher in the sidebar
What you'll see:
- List of previous matching jobs (if any)
- Prepare Data + button for uploading new data
- Trigger matching button to start new jobs
- Search and filter capabilities for job history
Step 2: Prepare and upload data
2.1 Initiate upload process
- Click the Prepare Data + button on the dashboard
- The "Upload Data" modal window will appear
2.2 Select storage
In the "Select Storage" step:
- Storage Name: Select your target storage from the dropdown
  - If you have access to only one storage, it will be pre-selected
  - Multiple storages: choose the appropriate one for your project
- Click Next to proceed
Important Notes:
- Storage access is managed through role-based access control
- Contact your administrator if no storage appears
- Storage must be configured before first use
2.3 Authorize storage access
The authorization step provides credentials for Azure CLI access.
Option A: Automatic Credential Integration (Recommended)
- Click the Unwrap button when prompted
- Review the security warning about displaying sensitive credentials
- Click Unwrap again to confirm
- The command will auto-populate with your credentials
- Copy the complete `az login` command
- Open your terminal and execute the command
Option B: Manual Credential Entry
- Copy the `az login` command template from the UI
- Paste it into your terminal
- Replace `<client_id>` and `<client_secret>` with your actual credentials
- Replace `<tenant_id>` with your tenant identifier
- Execute the command
Command example:
```shell
az login --service-principal \
  --username <client_id> \
  --password <client_secret> \
  --tenant <tenant_id>
```
Security Best Practices:
- Never share credentials with unauthorized parties
- Credentials are temporary and scoped to specific operations
- Tokens expire after a set period
- Re-authenticate if credentials expire
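In keeping with these practices, one way to keep secrets out of your shell history is to read them from environment variables and invoke the CLI programmatically. A sketch in Python; the GEM_* variable names are illustrative, not defined by GEM:

```python
# Sketch: run az login without typing secrets into the shell.
# Assumes az is on PATH (on Windows you may need shell=True) and that
# the illustrative GEM_* environment variables are set beforehand.
import os
import subprocess

subprocess.run(
    [
        "az", "login", "--service-principal",
        "--username", os.environ["GEM_CLIENT_ID"],
        "--password", os.environ["GEM_CLIENT_SECRET"],
        "--tenant", os.environ["GEM_TENANT_ID"],
    ],
    check=True,  # raise if login fails
)
```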
2.4 Upload your data file
1. In the Enter local file path field, type or paste the full path to your Parquet file
   - Windows Example: `C:\Users\username\data\my_map_data.parquet`
   - macOS/Linux Example: `/Users/username/data/my_map_data.parquet`
2. The UI will automatically extract the filename and update the upload command
3. Copy the generated az storage blob upload command
4. Execute the command in your terminal:
```shell
az storage blob upload \
  --account-name <storage-account> \
  --container-name <container> \
  --name your_data.parquet \
  --file /path/to/local/your_data.parquet \
  --auth-mode login
```
5. Wait for upload completion - Progress will display in your terminal
6. Click Finish to close the modal
Upload Tips:
- Larger files take longer - be patient
- Azure CLI supports files of any size
- Ensure stable network connection
- Keep the filename simple (no special characters)
- Verify upload success before proceeding (a local validity check is sketched below)
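The local validity check mentioned above can be as simple as reading the file's schema before (or right after) uploading, which can prevent a failed job later. A sketch with pyarrow; the library choice is an assumption, any Parquet reader works:

```python
# Sketch: local sanity check that the file is valid Parquet and
# contains the three required columns.
import pyarrow.parquet as pq

REQUIRED = {"id", "is_navigable", "geometry"}

schema = pq.read_schema("my_map_data.parquet")
missing = REQUIRED - set(schema.names)
if missing:
    raise SystemExit(f"Missing required columns: {missing}")

num_rows = pq.read_metadata("my_map_data.parquet").num_rows
print(f"OK: {num_rows} rows, columns: {schema.names}")
```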
Step 3: Trigger matching job
3.1 Open matching form
- Return to the GEM dashboard
- Click the Trigger matching button
- The "Run GEM Matching" form will appear
3.2 Complete the form
Fill in all required fields:
| Field | Description | Example |
|---|---|---|
| Input file name | Exact filename from upload step (including .parquet extension) | my_map_data.parquet |
| Storage Name | Same storage used for upload | Select from dropdown |
| Matching Type | Algorithm to use (currently: Road Matching only) | Road Matching |
| Overture Release | Reference map version (auto-populated) | 2024-09-24.0 |
3.3 Submit the job
- Review all entries for accuracy
- Ensure the filename matches exactly (case-sensitive)
- Verify you selected the correct storage
- Click Submit
Form Validation:
- Empty fields will trigger validation errors
- Incorrect filename will cause job failure
- Storage mismatch will result in file not found error
3.4 Job submission confirmation
Upon successful submission:
- The form closes automatically
- A new entry appears in the job list
- Initial status shows as In Progress
- Job ID is generated for tracking
If submission fails:
- Error message appears in the modal
- Common causes:
- File not found in specified storage
- Invalid file format
- GEM service temporarily unavailable
- Authorization issues
- Review error message and correct the issue
Step 4: Monitor job progress
4.1 Job status dashboard
The main dashboard displays all your matching jobs with real-time status updates.
Job Status Types:
| Status | Icon | Description | Typical Duration |
|---|---|---|---|
| In Progress | 🔄 | Job is actively processing | Varies by data size (~100K roads/hour) |
| Success | ✅ | Matching completed successfully | N/A - Ready for download |
| Failed | ❌ | Job encountered an error | N/A - Requires investigation |
4.2 Using dashboard features
Search by Job ID:
- Use the search bar to find specific jobs
- Enter partial or complete job ID
- Results filter in real-time
Filter Jobs:
- Filter by status (In Progress, Success, Failed)
- Filter by storage
- Filter by Overture release version
- Filter by matching type
Sort Options:
- Sort by submission date
- Sort by completion date
- Sort by job name
- Sort by status
4.3 Refresh status
- The dashboard auto-refreshes periodically
- Manual refresh available via browser refresh
- Click on a job row to view detailed information
Monitoring Tips:
- Processing time varies based on data size
- Average: ~100,000 road segments per hour (see the estimate sketch below)
- Small datasets (< 10K roads): Minutes
- Large datasets (> 1M roads): Hours
- No email notifications yet (planned feature)
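The estimate sketch referenced above combines your row count with the ~100,000 segments/hour average; treat the result as a rough guide only, since actual times vary with system load:

```python
# Back-of-envelope duration estimate from the ~100K segments/hour
# average quoted above. pyarrow is assumed, as in earlier sketches.
import pyarrow.parquet as pq

rows = pq.read_metadata("my_map_data.parquet").num_rows
print(f"{rows:,} segments -> roughly {rows / 100_000:.1f} hours")
```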
Step 5: View job details
5.1 Access details page
- Locate your job in the dashboard list
- Click the details arrow (→) at the end of the job row
- The Job Run Details page opens
5.2 Details page overview
The details page contains two main sections:
Job Run Details Section:
- Job ID (unique identifier)
- Input filename used
- Storage location
- Matching type applied
- Overture release version
- Submission timestamp
- Completion timestamp (if finished)
- Job status
Download Results Section (appears only for successful jobs):
- Results filename
- Download instructions
- Azure CLI commands for downloading
5.3 Interpreting results
For successful jobs, review the matching statistics:
- Roads Matched: Percentage successfully matched to GERS IDs
- Roads Unmatched: Percentage without matches
- Roads Fully Matched: Complete single GERS ID assignments
- Roads Partially Matched: Multiple potential matches
- Confidence Threshold: Minimum score applied (typically >60%)
Quality Indicators:
- Above 85% matched: Excellent quality
- 70-85% matched: Good quality, review unmatched roads
- Less than 70% matched: May indicate data quality issues
Step 6: Download results
6.1 Initiate download
For jobs with Success status:
- On the Job Details page, locate the Download Results section
- Click the Download button
- The "Download Data" modal appears
6.2 Authorize storage (if needed)
If you're already authorized from the upload step, skip to 6.3.
If authorization expired or this is a new session:
- Follow the same authorization process as Step 2.3
- Unwrap credentials and execute the `az login` command
- Proceed once authenticated
6.3 Specify download location
1. In the Local destination directory path field, enter where you want results saved
   - Windows Example: `C:\Users\username\downloads\gem_results`
   - macOS/Linux Example: `/Users/username/downloads/gem_results`
2. The system updates the download command with:
   - Storage account name
   - Container name
   - Results filename
   - Your specified destination
3. Copy the complete `az storage blob download` command
6.4 Execute download
Run the command in your terminal:
```shell
az storage blob download \
  --account-name <storage-account> \
  --container-name <container> \
  --name predictions.parquet \
  --file /path/to/destination/predictions.parquet \
  --auth-mode login
```
Download Process:
- Progress displays in terminal
- Download time depends on results file size
- Verify download completes successfully
- Click Finish to close the modal
6.5 Verify downloaded results
After download completes:
- Navigate to your specified destination directory
- Confirm the `predictions.parquet` file exists
- Check the file size is reasonable (not 0 bytes)
- Open the file in a Parquet viewer or analytical tool
- Verify data structure and content
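The same verification can be scripted; a sketch with pyarrow, where the path and library choice are illustrative:

```python
# Sketch: verify the downloaded results open as Parquet and are
# non-empty. Adjust the path to your destination directory.
from pathlib import Path
import pyarrow.parquet as pq

path = Path("/Users/username/downloads/gem_results/predictions.parquet")
assert path.exists() and path.stat().st_size > 0, "file missing or empty"

meta = pq.read_metadata(path)
print(f"{meta.num_rows} rows, columns: {pq.read_schema(path).names}")
```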
Working with multiple jobs
Running parallel jobs
You can submit multiple jobs simultaneously:
- Each job processes independently
- No limit on concurrent jobs (subject to system capacity)
- Track all jobs from the main dashboard
- Download results as each job completes
Organizing your work
Best Practices:
- Use descriptive filenames for easy identification
- Keep track of which data corresponds to which job
- Download results promptly after completion
- Maintain local backup of input data
- Document matching parameters used
Job history management
- All jobs remain in your history
- Search by job ID for quick access
- Filter to find specific job types
- Review past results for comparison
- No automatic deletion of job records
Troubleshooting common issues
Authentication problems
Issue: az login fails
Solutions:
- Verify credentials are copied correctly (no extra spaces)
- Check credentials haven't expired
- Confirm Client ID, Secret, and Tenant ID are correct
- Try unwrapping credentials again from UI
- Contact administrator if credentials are invalid
Upload failures
Issue: File upload fails or times out
Solutions:
- Check internet connection stability
- Verify file path is correct and file exists
- Ensure sufficient storage permissions (Full Access role)
- Try smaller file for testing
- Check Azure CLI is installed correctly: `az --version`
Job submission errors
Issue: Cannot submit matching job
Solutions:
- Confirm filename matches exactly (case-sensitive, including extension)
- Verify file was successfully uploaded to storage
- Ensure file is valid Parquet format
- Check required fields (id, is_navigable, geometry) exist
- Check system status or contact support
Job failures
Issue: Job status shows "Failed"
Solutions:
- Review detailed error logs in job details page
- Verify input data meets format requirements
- Check data quality (valid geometries, complete records)
- Ensure no corrupted records in Parquet file
- Contact support with job ID for investigation
Download issues
Issue: Cannot download results
Solutions:
- Re-authenticate if credentials expired
- Verify destination directory exists and is writable
- Check disk space availability
- Ensure correct storage permissions (Read Access or Full Access)
- Try different destination path
Tips for optimal results
Data quality
- Clean geometries (valid WKT LineStrings; a check is sketched after this list)
- Complete records (no null required fields)
- Unique IDs for each road segment
- Accurate `is_navigable` flags
- Proper coordinate systems
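The geometry check referenced above can be scripted with any WKT parser; a sketch using shapely, which is an assumption here, not a GEM requirement:

```python
# Sketch: flag invalid or non-LineString geometries before submission.
# Requires shapely (pip install pandas pyarrow shapely).
import pandas as pd
from shapely import wkt
from shapely.errors import ShapelyError

df = pd.read_parquet("my_map_data.parquet")

def is_valid_linestring(text: str) -> bool:
    try:
        geom = wkt.loads(text)
    except (ShapelyError, TypeError):
        return False
    return geom.geom_type == "LineString" and geom.is_valid

bad = df[~df["geometry"].map(is_valid_linestring)]
print(f"{len(bad)} suspect geometries out of {len(df)}")
```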
Performance optimization
- Process data in reasonable batches
- Use fast, stable internet connection
- Upload during off-peak hours for large files
- Monitor job progress regularly
- Download results promptly after completion
Matching quality
- High-quality input data → higher match rates
- Recent data → better alignment with Overture
- Complete network coverage → fewer gaps
- Review unmatched roads for patterns
- Iterate and improve data quality
Getting help
Self-service resources
- In-App Help: Tooltips and inline guidance in GEM UI
- Documentation: This guide and related documentation
- FAQ: Common questions and answers
- Technical Docs: System architecture and integration details
Support channels
For Issues or Questions:
- Review job error logs in the detailed view
- Check prerequisites are met
- Consult troubleshooting section above
- Contact support team with:
- Job ID (if applicable)
- Error messages
- Steps to reproduce
- Screenshots if helpful
Support Contact:
- Support portal
- Email: [Contact through support]
- Include: Job IDs, timestamps, error details
Next steps
After successfully matching your data:
- Analyze Results: Review matching statistics and confidence scores (a match-rate sketch follows this list)
- Validate Quality: Spot-check matched GERS IDs against your data
- Integrate: Use GERS IDs in your applications and workflows
- Iterate: Refine input data based on matching results
- Scale: Process additional datasets as needed
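For the analysis step, a sketch of computing an overall match rate from the downloaded file. The `gers_id` column name is hypothetical; inspect the actual schema of your results first:

```python
# Hypothetical sketch: overall match rate from the results file.
# "gers_id" is an assumed column name for illustration only; check
# your downloaded predictions.parquet for the real schema.
import pandas as pd

df = pd.read_parquet("predictions.parquet")
matched = df["gers_id"].notna()  # assumed: null means unmatched
print(f"Matched {matched.sum():,} of {len(df):,} roads ({matched.mean():.1%})")
```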
Additional resources
- Understanding GERS IDs - Learn about the reference system
- Use Cases - Explore practical applications
- Features & Benefits - Understand GEM capabilities
- Technical Documentation - Deep dive into system architecture
- FAQ - Find answers to common questions
Workflow checklist
Use this checklist to ensure you complete all steps:
- Prerequisites verified (access, tools, data format)
- Logged in to the console and opened the GEM dashboard
- Selected appropriate project
- Prepared data in Parquet format with required fields
- Uploaded data via Prepare Data workflow
- Authorized storage access with Azure CLI
- Verified file upload success
- Triggered matching job with correct parameters
- Monitored job status to completion
- Reviewed job details and matching statistics
- Downloaded results successfully
- Verified downloaded file integrity
- Documented job details for records
Workflow Complete! You're now ready to use your matched data with GERS IDs.