DB Table

AI Data Profiler

Learn how to use Google Gemini AI to analyze your data and get intelligent suggestions for optimal column types and formatting.

The AI Data Profiler is one of DB Table's most powerful features. It sends a small sample of your data to Google's Gemini AI, which suggests an optimal type for each column. Because it considers column names and context as well as values, it can identify patterns that rule-based automatic detection might miss and recommend better data formatting.

What is AI Data Profiling?

Overview

AI Data Profiling uses machine learning to:

  • Analyze data patterns beyond simple rule-based detection
  • Understand context from column names and data relationships
  • Suggest improvements to automatic type detection
  • Identify edge cases that traditional algorithms miss
  • Provide a confidence level (High, Medium, or Low) for each recommendation

How It Works

  1. Data Sampling - Sends the first 5 rows of each column to the Gemini AI service
  2. Context Analysis - AI examines column names, data patterns, and relationships
  3. Pattern Recognition - Identifies complex patterns using machine learning
  4. Type Suggestion - Recommends optimal column types with reasoning
  5. Confidence Scoring - Provides reliability indicators for each suggestion
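
The sampling step can be pictured with a short Python sketch. The function name build_sample_payload and the payload shape (column name mapped to a list of values) are assumptions made for illustration; DB Table's actual request format is not described in this guide.

```python
from typing import Any

SAMPLE_ROWS = 5  # the profiler sends only the first 5 rows of each column


def build_sample_payload(
    rows: list[dict[str, Any]], sample_rows: int = SAMPLE_ROWS
) -> dict[str, list[Any]]:
    """Collect the first `sample_rows` values of every column, keyed by column name."""
    payload: dict[str, list[Any]] = {}
    for row in rows[:sample_rows]:
        for column, value in row.items():
            payload.setdefault(column, []).append(value)
    return payload


# Eight rows of hypothetical data; only the first five end up in the payload.
rows = [{"purchase_date": f"01/{day:02d}/2024", "amount": day * 10} for day in range(1, 9)]
payload = build_sample_payload(rows)
```

Whatever the real format looks like, the point is the same: rows beyond the fifth never enter the payload.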

Privacy Note: Only a small sample (first 5 rows) of your data is sent to Google's Gemini AI service. The rest of your data remains on your device.

When to Use AI Profiling

Ideal Scenarios

Mixed or Complex Data

  • Columns with inconsistent formatting
  • Data that doesn't fit standard patterns
  • Multiple data types in the same column

Ambiguous Column Names

  • Generic headers like "Value" or "Data"
  • Abbreviated or unclear column names
  • Foreign language column headers

Large Datasets

  • When manual review would be time-consuming
  • Multiple sheets with similar but not identical structures
  • Bulk data processing workflows

Quality Assurance

  • Verifying automatic detection results
  • Second opinion on critical data
  • Standardizing data across multiple files

When Manual Review Might Be Better

  • Very small datasets (< 10 rows)
  • Highly sensitive data that shouldn't leave your device
  • Custom data types not covered by the standard type categories
  • Offline environments without internet access

Using the AI Profiler

Step-by-Step Process

  1. Upload and view your data - Ensure your file is loaded and visible
  2. Click "AI Data Profiler" - Located in the data table toolbar
  3. Wait for analysis - AI processes your data (typically 3-10 seconds; complex analyses can take up to 30 seconds)
  4. Review suggestions - Examine AI recommendations in the results dialog
  5. Apply changes - Choose to apply all suggestions or review individually

Understanding the Interface

Analysis Button

  • Located in the main toolbar
  • Shows loading state during processing
  • Disabled if no data is loaded

Results Dialog

  • Shows all column suggestions
  • Displays confidence indicators
  • Provides reasoning for each suggestion
  • Offers bulk apply or individual selection options

Progress Indicators

  • Loading spinner during analysis
  • Success/error states
  • Processing time estimates

Understanding AI Suggestions

Suggestion Format

Each AI recommendation includes:

Column: "purchase_date"
Current Type: String
Suggested Type: Date
Confidence: High
Reasoning: "Contains date patterns in MM/DD/YYYY format"
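
If you track suggestions outside the app, for example to document your decisions, a record with these five fields could be modelled as below. This is a hypothetical sketch: the class and field names mirror the labels shown above and are not a documented DB Table API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """One AI recommendation, with the five fields shown in the results dialog."""

    column: str
    current_type: str
    suggested_type: str
    confidence: str  # "High", "Medium", or "Low"
    reasoning: str

    def changes_type(self) -> bool:
        """True when the suggested type differs from the current one."""
        return self.suggested_type != self.current_type


example = Suggestion(
    column="purchase_date",
    current_type="String",
    suggested_type="Date",
    confidence="High",
    reasoning="Contains date patterns in MM/DD/YYYY format",
)
```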

Confidence Levels

High Confidence 🟢

  • AI is very certain about the suggestion
  • Strong pattern recognition
  • Consistent data formatting
  • Safe to apply automatically

Medium Confidence 🟡

  • Good pattern match but some uncertainty
  • Minor inconsistencies in data
  • Review recommended before applying
  • Usually accurate but verify edge cases

Low Confidence 🔴

  • Uncertain or conflicting patterns
  • Mixed data types detected
  • Manual review strongly recommended
  • Consider keeping current type
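
The guidance for the three levels can be summarized as a simple lookup. The action names ("apply", "review", "keep") are illustrative labels chosen for this sketch, not values used by DB Table.

```python
# Recommended handling per confidence level, following the guidance above.
RECOMMENDED_ACTION = {
    "High": "apply",     # safe to apply automatically
    "Medium": "review",  # review before applying
    "Low": "keep",       # consider keeping the current type
}


def recommended_action(confidence: str) -> str:
    """Map a confidence level to the suggested handling.

    Unrecognized levels fall back to "review" so nothing is applied blindly.
    """
    return RECOMMENDED_ACTION.get(confidence.strip().capitalize(), "review")
```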

Common AI Insights

Date Recognition

  • Identifies unusual date formats
  • Recognizes international date conventions
  • Detects timestamps and partial dates
  • Handles mixed date formats

Numeric Patterns

  • Distinguishes between IDs and quantities
  • Identifies currency without symbols
  • Recognizes percentages in decimal form
  • Detects measurement units

Text Classification

  • Identifies categorical vs. free text
  • Recognizes codes and identifiers
  • Detects boolean-like text patterns
  • Classifies structured vs. unstructured text

Applying AI Suggestions

Bulk Application

Apply All Suggestions

  • Fastest option for trusted data
  • Best for high-confidence suggestions
  • Applies all recommendations at once
  • Can be undone with "Reset All Types"

Selective Application

  • Review each suggestion individually
  • Apply only high-confidence changes
  • Skip uncertain recommendations
  • Maintain control over critical columns
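
The difference between the two modes comes down to a confidence threshold. The sketch below assumes suggestions are plain dictionaries with "column", "suggested_type", and "confidence" keys, which is a hypothetical shape used only for illustration. It returns a new mapping and leaves the original untouched, so the original can serve as the snapshot for a reset.

```python
CONFIDENCE_RANK = {"Low": 0, "Medium": 1, "High": 2}


def apply_suggestions(column_types, suggestions, min_confidence="High"):
    """Return a copy of column_types with qualifying suggestions applied."""
    threshold = CONFIDENCE_RANK[min_confidence]
    updated = dict(column_types)  # copy, so the input stays available for undo
    for suggestion in suggestions:
        rank = CONFIDENCE_RANK.get(suggestion["confidence"], 0)
        if rank >= threshold and suggestion["column"] in updated:
            updated[suggestion["column"]] = suggestion["suggested_type"]
    return updated


column_types = {"purchase_date": "String", "qty": "String", "notes": "String"}
suggestions = [
    {"column": "purchase_date", "suggested_type": "Date", "confidence": "High"},
    {"column": "qty", "suggested_type": "Number", "confidence": "Medium"},
]

# Selective: only high-confidence changes are applied.
selective = apply_suggestions(column_types, suggestions)
# Bulk: lowering the threshold to "Low" applies every suggestion.
bulk = apply_suggestions(column_types, suggestions, min_confidence="Low")
```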

Individual Review Process

  1. Examine each suggestion - Read the AI reasoning
  2. Check confidence level - Prioritize high-confidence changes
  3. Verify with data - Look at actual column values
  4. Apply or skip - Make informed decisions
  5. Test formatting - Verify results look correct

Handling Conflicts

When AI suggestions conflict with your knowledge:

  • Trust domain expertise - You know your data best
  • Consider context - AI might miss business logic
  • Test both options - Try the suggestion and compare
  • Document decisions - Note why you chose differently

Advanced Features

Multi-Sheet Analysis

For Excel files with multiple sheets:

  • Sheet-by-sheet analysis - Each sheet analyzed separately
  • Consistent suggestions - Similar columns get similar recommendations
  • Cross-sheet learning - AI considers patterns across sheets
  • Bulk application - Apply suggestions to all sheets

Iterative Improvement

AI profiling can be run multiple times:

  • After manual changes - Get suggestions for remaining columns
  • With new data - Re-analyze after adding rows
  • Quality checks - Verify previous decisions
  • Continuous improvement - Refine type assignments

Integration with Manual Overrides

AI suggestions work alongside manual type changes:

  • Preserves manual changes - Won't override your decisions
  • Suggests improvements - Recommends better alternatives
  • Complements detection - Enhances automatic algorithms
  • Provides validation - Confirms your type choices
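
One way to picture this precedence: manual choices win over AI suggestions, which in turn win over automatic detection. The function names below are hypothetical, and the sketch describes the behavior listed above rather than DB Table's internal implementation.

```python
def resolve_types(detected, ai_suggested, manual):
    """Combine the three sources; later updates take precedence."""
    resolved = dict(detected)      # start from automatic detection
    resolved.update(ai_suggested)  # AI suggestions refine detection
    resolved.update(manual)        # manual overrides are never replaced
    return resolved


def confirmed_choices(ai_suggested, manual):
    """Columns where the AI independently agrees with a manual choice."""
    return sorted(col for col, chosen in manual.items() if ai_suggested.get(col) == chosen)


detected = {"id": "Number", "date": "String", "zip": "Number"}
ai_suggested = {"id": "String", "date": "Date", "zip": "String"}
manual = {"id": "Number", "zip": "String"}

resolved = resolve_types(detected, ai_suggested, manual)
validated = confirmed_choices(ai_suggested, manual)
```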

Best Practices

Preparing for AI Analysis

  1. Clean column headers - Use descriptive, clear names
  2. Remove empty columns - Focus AI on meaningful data
  3. Check data quality - Fix obvious errors first
  4. Understand your data - Know what each column represents

Reviewing AI Suggestions

  1. Start with high confidence - Apply obvious improvements first
  2. Verify critical columns - Double-check important data types
  3. Test edge cases - Look at unusual values in suggested columns
  4. Document changes - Note why you accepted or rejected suggestions

Optimizing Results

  1. Use descriptive headers - Help AI understand column purpose
  2. Consistent formatting - Clean data gets better suggestions
  3. Provide context - Include units or descriptions in headers
  4. Iterate gradually - Make changes in small batches

Troubleshooting

Common Issues

"AI analysis failed"

  • Cause: Network connectivity or API issues
  • Solution: Check internet connection and try again

"No suggestions available"

  • Cause: All columns already optimally typed
  • Solution: This is actually good news - your data is well-formatted!

"Unexpected suggestions"

  • Cause: AI misunderstood data context
  • Solution: Review suggestions carefully and apply selectively

"Analysis taking too long"

  • Cause: Large number of columns or API delays
  • Solution: Wait up to 30 seconds. If the analysis still has not finished, refresh the page and try again
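
The "try again" advice for network and API errors amounts to a small retry loop. The sketch below is generic Python, not DB Table code: analyze stands in for any callable that performs the analysis request, and the attempt count and delay are arbitrary example values.

```python
import time


def run_with_retry(analyze, attempts=2, delay_seconds=1.0, sleep=time.sleep):
    """Call analyze(), retrying on network-style errors.

    Raises RuntimeError once all attempts have failed.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return analyze()
        except (ConnectionError, TimeoutError) as error:
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # brief pause before the next attempt
    raise RuntimeError("AI analysis failed") from last_error
```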

Performance Tips

  1. Limit column count - AI works best with < 50 columns
  2. Clean data first - Remove obviously problematic columns
  3. Stable internet - Ensure good connectivity for API calls
  4. Patience - Allow 10-30 seconds for complex analyses

Privacy & Security

Data Handling

  • Minimal data sharing - Only first 5 rows sent to AI
  • No data storage - Google doesn't retain your data
  • Encrypted transmission - Secure HTTPS connections
  • Optional feature - Can be skipped entirely

Best Practices

  1. Review sensitive data - Consider if any columns contain PII
  2. Use sample files - Test with non-sensitive data first
  3. Understand limitations - AI suggestions are recommendations, not requirements
  4. Maintain control - You decide what changes to apply
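
When building a non-sensitive sample file, a header check can help flag columns that may contain PII. The hint list below is an illustrative assumption and far from exhaustive; substring matching on headers will miss some sensitive columns and flag some harmless ones, so it does not replace a manual review.

```python
PII_HINTS = ("name", "email", "phone", "ssn", "address")  # illustrative only


def drop_sensitive_columns(columns, hints=PII_HINTS):
    """Return only the columns whose header contains none of the hint words."""
    return {
        header: values
        for header, values in columns.items()
        if not any(hint in header.lower() for hint in hints)
    }


columns = {
    "Customer Email": ["a@example.com"],
    "purchase_date": ["01/02/2024"],
    "Phone": ["555-0100"],
    "amount": [19.99],
}
safe_columns = drop_sensitive_columns(columns)
```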

Next steps: After optimizing your column types with AI, learn about data editing to make changes to your data, or explore data viewing to better navigate your improved dataset.