Getting Started with GEM
Introduction
Global Entity Matcher (GEM) is designed to simplify the process of aligning your geospatial data with standardized identifiers. This guide will help you understand how to access and begin using GEM for your organization.
Overview
The Global Entity Matcher (GEM) enables you to match your proprietary transportation network data against Overture Maps reference datasets. GEM provides matching results with confidence scores, helping you assess the accuracy and reliability of the alignment between your data and reference map elements.
Prerequisites
Before getting started with GEM, ensure you have:
1. Console Access
- Dashboard Account: Active access to my.tomtom.com
- Microsoft Entra ID: Authentication credentials configured
- Project Assignment: Assigned to a project with GEM access enabled
2. Technical Tools
- Azure CLI: Required for data upload/download operations
- Install from Microsoft Azure CLI
- Azure CLI is the recommended method as it supports files of any size
- Alternative upload methods may be restricted by system memory and network limitations
- Terminal Access: Command-line interface on your machine
- Stable Internet: For uploading/downloading large datasets
3. Data Requirements
- Format: Input files must be in Apache Parquet format with the .parquet extension
- Files in other formats can be uploaded to storage but will not trigger the matching process
- Required Fields: Each record must contain:
- id: Unique identifier (integer)
- is_navigable: Boolean flag indicating if the road is navigable
- geometry: LineString in WKT (Well-Known Text) format
Example record structure:
{
  "id": 5707295,
  "is_navigable": true,
  "geometry": "LINESTRING (145.18156715700002 -37.87340530899996, 145.1809221540001 -37.87356512499997)"
}
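Before converting your data to Parquet, it can help to check each record against the three required fields. The following is a minimal sketch using only the Python standard library. The function name `validate_record` and the regular expression are illustrative assumptions, not part of GEM. The pattern accepts only simple 2D LineStrings and is deliberately loose, so it is not a full WKT parser.

```python
import re

# Loose pattern for a 2D WKT LineString: "LINESTRING (x y, x y, ...)".
# Assumption: this is a simplified check, not a complete WKT grammar.
_NUMBER = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
_POINT = rf"{_NUMBER}\s+{_NUMBER}"
_LINESTRING = re.compile(
    rf"^\s*LINESTRING\s*\(\s*{_POINT}(?:\s*,\s*{_POINT})+\s*\)\s*$",
    re.IGNORECASE,
)


def validate_record(record):
    """Return a list of problems found in one input record (empty if none)."""
    problems = []
    record_id = record.get("id")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(record_id, int) or isinstance(record_id, bool):
        problems.append("id must be an integer")
    if not isinstance(record.get("is_navigable"), bool):
        problems.append("is_navigable must be a boolean")
    geometry = record.get("geometry")
    if not isinstance(geometry, str) or not _LINESTRING.match(geometry):
        problems.append("geometry must be a WKT LINESTRING with at least two points")
    return problems


example = {
    "id": 5707295,
    "is_navigable": True,
    "geometry": "LINESTRING (145.18156715700002 -37.87340530899996, "
                "145.1809221540001 -37.87356512499997)",
}
print(validate_record(example))  # prints []
```

A record that fails any check returns one message per failed field, which makes it easy to log problems before upload.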
4. Knowledge Requirements
- Understanding of your source data structure and format
- Familiarity with data licensing requirements and restrictions
- Basic command-line operations
- Understanding of geospatial data concepts
How to access GEM
GEM is available through the Dashboard platform.
Access steps:
1. Navigate to Dashboard
- Go to my.tomtom.com
2. Authenticate
- Log in using Microsoft Entra ID (Azure AD) credentials
- Authentication is required to access the GEM interface
3. Select Project
- In the left navigation pane, select the appropriate project from the dropdown menu
- Only projects with GEM access will show the Global Entity Matcher option
4. Access GEM Dashboard
- Click Global Entity Matcher in the sidebar
- If this option is not visible, your organization or project may not be onboarded yet
Access requirements
Note: If your organization or project is not supported by GEM UI, the "Global Entity Matcher" option will not appear in the sidebar. Contact your system administrator or support team to request access.
GEM UI capabilities
The GEM User Interface provides comprehensive features for data matching:
Key features:
Request Management
- View and track all previous matching requests
- Search jobs by ID
- Filter by status, storage, Overture release, and matching type
- Access detailed information for each job run
Data Preparation
- Retrieve storage credentials securely
- Upload data using Azure CLI with step-by-step guidance
- Support for large files (no size limit with Azure CLI)
Job Submission
- Trigger matching requests with customizable parameters
- Select input file, storage, matching type, and Overture release
- Real-time validation of form inputs
Results Visualization
- Monitor matching job status and progress
- View detailed matching statistics
- Track confidence scores and match quality metrics
Download Results
- Retrieve matching results via Azure CLI
- Secure credential management
- Guided download process
How to request access
If your organization or project is not yet onboarded, GEM access is currently arranged through direct engagement with our sales team. To get started:
- Contact Sales: Reach out to the sales team to discuss your specific needs
- Define Your Use Case: Share your requirements, data types, and business objectives
- Onboarding: Work with our team to configure GEM for your specific datasets and workflows
Quick start guide
Follow these steps for your first matching job:
Step 1: Install Azure CLI
If not already installed, download and install Azure CLI:
macOS:
brew install azure-cli
Windows: Download from Microsoft Azure CLI
Linux:
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
Verify installation:
az --version
Step 2: Access GEM dashboard
- Go to my.tomtom.com/gem
- Log in with your Microsoft Entra ID credentials
- Select your project from the dropdown
- You should see the GEM main page with:
- Brief description of GEM
- Table of previous matching runs (empty for first-time users)
- Prepare Data button
- Trigger matching button
Step 3: Upload your data
- Click the Prepare Data + button
- Select your storage from the dropdown
- Follow the authorization steps:
- Click Unwrap to reveal credentials
- Copy and run the az login command in your terminal
- Enter your local file path
- Copy and run the az storage blob upload command
- Wait for upload completion
- Click Finish
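The dashboard generates the exact upload command for you, so always copy it from there. For orientation only, the sketch below shows the general shape such a command might take. The storage account name, container name, and the use of `--auth-mode login` are assumptions for illustration. The flags themselves are standard `az storage blob upload` options. The code only builds and prints the command; it does not run it.

```python
import shlex


def build_upload_command(account_name, container_name, local_path, blob_name):
    """Build an `az storage blob upload` argument list (not executed here)."""
    return [
        "az", "storage", "blob", "upload",
        "--account-name", account_name,      # hypothetical account name
        "--container-name", container_name,  # hypothetical container name
        "--file", local_path,                # local Parquet file to upload
        "--name", blob_name,                 # name of the blob in storage
        "--auth-mode", "login",              # assumption: reuse the az login session
    ]


command = build_upload_command(
    "examplestorage", "input", "./my_data.parquet", "my_data.parquet"
)
print(shlex.join(command))
```

Keeping the command as an argument list rather than a single string avoids quoting problems if a file path contains spaces.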
Step 4: Trigger matching
- Click the Trigger matching button
- Fill in the form:
- Input file name: Your uploaded filename (e.g., my_data.parquet)
- Storage Name: Select the storage you used
- Matching Type: Road Matching (currently the only option)
- Overture Release: Auto-populated (e.g., 2024-09-24.0)
- Click Submit
- Your job appears in the list with "In Progress" status
Step 5: Monitor and download
- Refresh the dashboard to check job status
- When status changes to "Success", click the details arrow (→)
- Review the matching statistics
- Click Download in the Download Results section
- Authorize storage access (if needed)
- Specify local destination directory
- Copy and run the az storage blob download command
- Your results are now available locally
Processing Time: Approximately 100,000 road segments per hour. Small datasets may complete in minutes, while larger datasets may take several hours.
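To plan around the throughput figure above, a rough estimate can be computed from the number of road segments in your file. This is a back-of-the-envelope sketch that assumes throughput stays constant at the quoted rate; actual times will vary, and the function name is illustrative.

```python
SEGMENTS_PER_HOUR = 100_000  # approximate throughput quoted in this guide


def estimate_processing_hours(segment_count, segments_per_hour=SEGMENTS_PER_HOUR):
    """Rough processing-time estimate in hours, assuming constant throughput."""
    if segment_count < 0:
        raise ValueError("segment_count must be non-negative")
    return segment_count / segments_per_hour


for count in (5_000, 250_000, 1_200_000):
    hours = estimate_processing_hours(count)
    print(f"{count:>9,} segments: about {hours * 60:.0f} minutes ({hours:.2f} h)")
```

Under this assumption, 250,000 segments would take roughly 2.5 hours.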
Understanding your data
To get the most out of GEM, it's important to understand:
- Data Format: The structure and format of your geospatial datasets
- Data Quality: Current quality and accuracy of your data
- Update Frequency: How often your data changes and needs to be synchronized
- Coverage Area: Geographic regions covered by your datasets
System status and performance
Current status
GEM is fully operational and deployed in production:
System Metrics:
- Uptime: ≥99% (monitored continuously)
- Processing Speed: ~100,000 road segments matched per hour
- Matching Accuracy: >85% confidence scores for high-quality input data
- System Availability: Deployed on production cluster with Helm
- Security: No critical vulnerabilities - Regular security scanning active
Infrastructure:
- Database: Cloud database configured and operational
- Storage: Azure Blob Storage integration active
- Authorization: Role-based access control enforced
- Authentication: Microsoft Entra ID integrated
Performance monitoring
System performance and health are monitored continuously to ensure high availability and reliability.
Integration approach
GEM supports an iterative approach to integration:
- Initial Assessment: Analyze your current datasets and conflation challenges
- Pilot Implementation: Start with a representative subset of data to validate the approach
- Quality Review: Evaluate matching results and confidence scores
- Gradual Rollout: Expand coverage based on business priorities and results
- Continuous Optimization: Refine data quality and processes based on feedback
Recommended workflow
Phase 1: Pilot (2-4 weeks)
- Select small, representative dataset
- Run initial matching job
- Validate results quality
- Identify data quality issues
- Establish success criteria
Phase 2: Expansion (4-8 weeks)
- Process larger datasets
- Implement in production workflows
- Monitor performance and accuracy
- Gather user feedback
- Document best practices
Phase 3: Production (Ongoing)
- Regular data updates and matching
- Continuous quality monitoring
- Integration with downstream systems
- Periodic reviews and optimization
Next steps
Once you have access to GEM and have completed your first matching job:
Immediate actions
1. Review Results: Analyze matching statistics and confidence scores
- Check roads_matched percentage
- Review roads_fully_matched vs roads_partially_matched
- Identify unmatched roads for investigation
2. Validate Quality: Spot-check a sample of matched GERS IDs
- Compare with your source data
- Verify geometry alignment
- Check confidence scores
3. Document Findings: Record your observations
- Note data quality issues discovered
- Document successful matching patterns
- Identify areas for improvement
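When reviewing results, it can be useful to turn the statistics into percentages of your input. The sketch below uses the statistic names mentioned above (roads_matched, roads_fully_matched, roads_partially_matched) and assumes they are raw counts. The `total_roads` parameter and the output keys are hypothetical, and the numbers are made up for illustration. If your job details already report percentages, this step is unnecessary.

```python
def summarize_match_statistics(stats, total_roads):
    """Turn raw match counts into percentages for a quick quality review."""
    if total_roads <= 0:
        raise ValueError("total_roads must be positive")
    matched = stats["roads_matched"]
    return {
        "matched_pct": 100 * matched / total_roads,
        "fully_matched_pct": 100 * stats["roads_fully_matched"] / total_roads,
        "partially_matched_pct": 100 * stats["roads_partially_matched"] / total_roads,
        "unmatched": total_roads - matched,  # roads to investigate further
    }


# Illustrative numbers only, not real GEM output.
stats = {
    "roads_matched": 900,
    "roads_fully_matched": 750,
    "roads_partially_matched": 150,
}
summary = summarize_match_statistics(stats, total_roads=1000)
print(summary)
```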
Ongoing activities
1. Improve Data Quality: Based on matching results
- Fix geometry errors
- Complete missing fields
- Validate navigability flags
- Ensure unique IDs
2. Process Additional Data: Expand your matching coverage
- Upload remaining datasets
- Match different geographic regions
- Process historical data for comparison
3. Integrate GERS IDs: Use matched identifiers in your applications
- Update databases with GERS IDs
- Modify data pipelines
- Enable interoperability with other systems
4. Monitor Performance: Track your matching jobs
- Review processing times
- Monitor match rates over time
- Identify optimization opportunities
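One of the data quality checks above, ensuring unique IDs, is easy to automate before upload. The following is a minimal sketch assuming your records are available as Python dictionaries with an `id` key. The function name is illustrative.

```python
from collections import Counter


def find_duplicate_ids(records):
    """Return the sorted list of id values that occur more than once."""
    counts = Counter(record["id"] for record in records)
    return sorted(record_id for record_id, n in counts.items() if n > 1)


records = [{"id": 1}, {"id": 2}, {"id": 2}, {"id": 3}, {"id": 3}, {"id": 3}]
print(find_duplicate_ids(records))  # prints [2, 3]
```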
Support and documentation
Available resources
Self-Service Documentation:
- Workflow Guide - Step-by-step instructions for complete workflow
- Technical Documentation - System architecture and integration details
- Features & Benefits - Explore GEM capabilities
- Use Cases - Industry-specific applications
- Understanding GERS IDs - Learn about the technology
- FAQ - Common questions and answers
Technical Support:
- Support: Contact through support portal
- Error Logs: Available in job details page
Training Resources:
- Inline guidance and tooltips in GEM UI
- Azure CLI documentation from Microsoft
- Azure Blob Storage guides
Getting help
When contacting support, provide:
- Job ID (if applicable)
- Error messages (exact text)
- Steps to reproduce the issue
- Screenshots (if helpful)
- Data sample (if related to data quality)
Typical Response Times:
- Critical issues: Contact support immediately
- General questions: 1-2 business days
- Feature requests: Reviewed quarterly
Typical timeline
The time to get started with GEM varies based on your setup:
| Phase | Duration | Activities |
|---|---|---|
| Initial Setup | 1-2 days | Install Azure CLI, verify access, prepare first dataset |
| First Match | 1 day | Upload data, trigger job, download results |
| Pilot Evaluation | 1-2 weeks | Test with representative data, validate results |
| Process Refinement | 2-4 weeks | Improve data quality, optimize workflow |
| Production Deployment | Ongoing | Regular matching jobs, integration with systems |
Factors Affecting Timeline:
- Data quality and preparation time
- Dataset size and complexity
- Internal approval processes
- Integration requirements
- Team availability and experience
Best practices for getting started
Data preparation
- Start with clean, well-structured data
- Validate Parquet files before upload
- Ensure all required fields are present
- Use descriptive filenames
- Test with small dataset first
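A quick pre-upload sanity check can catch files that carry the .parquet extension but are not Parquet at all. Unencrypted Parquet files begin and end with the 4-byte marker `PAR1`, so the sketch below checks the extension and both markers using only the standard library. The function name is illustrative, and the case-sensitive extension check is an assumption. This does not verify the schema or the required fields; a Parquet library is needed for that.

```python
import os
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # marker at the start and end of a Parquet file


def looks_like_parquet(path):
    """Cheap check: .parquet extension plus leading and trailing magic bytes."""
    path = Path(path)
    # Assumption: the extension is matched case-sensitively.
    if path.suffix != ".parquet":
        return False
    # Smallest possible layout: magic + 4-byte footer length + magic.
    if path.stat().st_size < 12:
        return False
    with path.open("rb") as handle:
        head = handle.read(4)
        handle.seek(-4, os.SEEK_END)
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

For example, `looks_like_parquet("my_data.parquet")` returns False for a CSV file that was merely renamed.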
Job management
- Document each job's purpose and parameters
- Use consistent naming conventions
- Download results promptly after completion
- Keep local backups of input and output data
- Track job IDs for reference
Quality assurance
- Review matching statistics for each job
- Investigate unmatched or low-confidence matches
- Compare results across different data versions
- Validate random samples manually
- Monitor trends over time
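Manual validation is easier to repeat when the sample is drawn reproducibly. The sketch below picks a seeded random sample for review and lists matches below a confidence threshold. The `confidence` field name, its 0 to 1 scale, and the 0.85 default threshold are assumptions for illustration; adapt them to the columns in your downloaded results.

```python
import random


def pick_review_sample(matches, sample_size, seed=0):
    """Pick a reproducible random sample of matches for manual review."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    return rng.sample(matches, min(sample_size, len(matches)))


def low_confidence(matches, threshold=0.85):
    """Return matches whose (hypothetical) confidence field is below threshold."""
    return [match for match in matches if match["confidence"] < threshold]
```

Recording the seed alongside the job ID lets a colleague reproduce exactly the same review sample later.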
Security
- Never share credentials with unauthorized parties
- Use credentials only in secure terminals
- Don't store credentials in scripts or code
- Re-authenticate when credentials expire
- Report any security concerns immediately
Contact information
Ready to get started? Contact our team:
- Dashboard Portal: my.tomtom.com
- Product Page: GEM
- Support: Available through support platform
- Sales: Contact for enterprise or on-premises deployments
Additional resources
- Workflow Guide - Detailed step-by-step instructions
- Features & Benefits - Explore what GEM can do for you
- Use Cases - See how others are using GEM
- Understanding GERS IDs - Learn about the technology behind GEM
- Technical Documentation - System architecture and integration details
- FAQ - Common questions and answers
- Azure CLI Documentation - Microsoft's official guide
- Azure Blob Storage - Storage documentation