DDC_Skills_for_AI_Agents_in_Construction dwg-to-excel
Convert AutoCAD DWG files (1983-2026) to Excel databases using DwgExporter CLI. Extract layers, blocks, attributes, and geometry data without Autodesk licenses.
install
source · Clone the upstream repo
git clone https://github.com/datadrivenconstruction/DDC_Skills_for_AI_Agents_in_Construction
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/datadrivenconstruction/DDC_Skills_for_AI_Agents_in_Construction "$T" && mkdir -p ~/.claude/skills && cp -r "$T/1_DDC_Toolkit/CAD-Converters/dwg-to-excel" ~/.claude/skills/datadrivenconstruction-ddc-skills-for-ai-agents-in-construction-dwg-to-excel && rm -rf "$T"
manifest:
1_DDC_Toolkit/CAD-Converters/dwg-to-excel/SKILL.mdsource content
DWG to Excel Conversion
Business Case
Problem Statement
AutoCAD DWG files contain valuable project data locked in proprietary format:
- Layer structures with drawing organization
- Block references with attribute data
- Text annotations and dimensions
- Geometric entities (lines, polylines, arcs)
- External references (xrefs)
Extracting this data typically requires AutoCAD licenses or complex programming.
Solution
DwgExporter.exe converts DWG files to structured Excel databases offline, without Autodesk licenses.
Business Value
- Zero license cost - No AutoCAD license required
- Legacy support - Reads DWG files from 1983 to 2026
- Data extraction - Layers, blocks, attributes, text, geometry
- PDF export - Generate drawings from DWG layouts
- Batch processing - Convert thousands of DWG files
Technical Implementation
CLI Syntax
DwgExporter.exe <input_dwg> [options]
Output Formats
| Output | Description |
|---|---|
| Excel database with all entities |
| PDF drawings from layouts |
Supported Versions
| Version Range | Description |
|---|---|
| R12 (1992) | Legacy DWG |
| R14 (1997) | AutoCAD 14 |
| 2000-2002 | DWG 2000 format |
| 2004-2006 | DWG 2004 format |
| 2007-2009 | DWG 2007 format |
| 2010-2012 | DWG 2010 format |
| 2013-2017 | DWG 2013 format |
| 2018-2026 | DWG 2018 format |
Examples
# Basic conversion DwgExporter.exe "C:\Projects\FloorPlan.dwg" # Export with PDF drawings DwgExporter.exe "C:\Projects\FloorPlan.dwg" sheets2pdf # Batch processing all DWG in folder for /R "C:\Projects" %f in (*.dwg) do DwgExporter.exe "%f" # PowerShell batch conversion Get-ChildItem "C:\Projects\*.dwg" -Recurse | ForEach-Object { & "C:\DDC\DwgExporter.exe" $_.FullName }
Python Integration
import subprocess import pandas as pd from pathlib import Path from typing import List, Optional, Dict, Any from dataclasses import dataclass from enum import Enum class DWGEntityType(Enum): """DWG entity types.""" LINE = "LINE" POLYLINE = "POLYLINE" LWPOLYLINE = "LWPOLYLINE" CIRCLE = "CIRCLE" ARC = "ARC" ELLIPSE = "ELLIPSE" SPLINE = "SPLINE" TEXT = "TEXT" MTEXT = "MTEXT" DIMENSION = "DIMENSION" INSERT = "INSERT" # Block reference HATCH = "HATCH" SOLID = "SOLID" POINT = "POINT" ATTRIB = "ATTRIB" ATTDEF = "ATTDEF" @dataclass class DWGEntity: """Represents a DWG entity.""" handle: str entity_type: str layer: str color: int linetype: str lineweight: float # Geometry (depends on entity type) start_x: Optional[float] = None start_y: Optional[float] = None end_x: Optional[float] = None end_y: Optional[float] = None # Block reference data block_name: Optional[str] = None rotation: Optional[float] = None scale_x: Optional[float] = None scale_y: Optional[float] = None # Text data text_content: Optional[str] = None text_height: Optional[float] = None @dataclass class DWGBlock: """Represents a DWG block definition.""" name: str base_point_x: float base_point_y: float entity_count: int is_dynamic: bool attributes: List[str] @dataclass class DWGLayer: """Represents a DWG layer.""" name: str color: int linetype: str is_on: bool is_frozen: bool is_locked: bool lineweight: float entity_count: int class DWGExporter: """DWG to Excel converter using DDC DwgExporter CLI.""" def __init__(self, exporter_path: str = "DwgExporter.exe"): self.exporter = Path(exporter_path) if not self.exporter.exists(): raise FileNotFoundError(f"DwgExporter not found: {exporter_path}") def convert(self, dwg_file: str, export_pdf: bool = False) -> Path: """Convert DWG file to Excel.""" dwg_path = Path(dwg_file) if not dwg_path.exists(): raise FileNotFoundError(f"DWG file not found: {dwg_file}") cmd = [str(self.exporter), str(dwg_path)] if export_pdf: cmd.append("sheets2pdf") result = subprocess.run(cmd, capture_output=True, text=True) if result.returncode != 0: raise RuntimeError(f"Export failed: {result.stderr}") # Output file is same name with .xlsx extension return dwg_path.with_suffix('.xlsx') def batch_convert(self, folder: str, include_subfolders: bool = True, export_pdf: bool = False) -> List[Dict[str, Any]]: """Convert all DWG files in folder.""" folder_path = Path(folder) pattern = "**/*.dwg" if include_subfolders else "*.dwg" results = [] for dwg_file in folder_path.glob(pattern): try: output = self.convert(str(dwg_file), export_pdf) results.append({ 'input': str(dwg_file), 'output': str(output), 'status': 'success' }) print(f"✓ Converted: {dwg_file.name}") except Exception as e: results.append({ 'input': str(dwg_file), 'output': None, 'status': 'failed', 'error': str(e) }) print(f"✗ Failed: {dwg_file.name} - {e}") return results def read_entities(self, xlsx_file: str) -> pd.DataFrame: """Read converted Excel as DataFrame.""" xlsx_path = Path(xlsx_file) if not xlsx_path.exists(): raise FileNotFoundError(f"Excel file not found: {xlsx_file}") return pd.read_excel(xlsx_file, sheet_name="Elements") def get_layers(self, xlsx_file: str) -> pd.DataFrame: """Get layer summary from converted file.""" df = self.read_entities(xlsx_file) if 'Layer' not in df.columns: raise ValueError("Layer column not found in data") summary = df.groupby('Layer').agg({ 'Handle': 'count' }).reset_index() summary.columns = ['Layer', 'Entity_Count'] return summary.sort_values('Entity_Count', ascending=False) def get_blocks(self, xlsx_file: str) -> pd.DataFrame: """Get block reference summary.""" df = self.read_entities(xlsx_file) # Filter to INSERT entities (block references) blocks = df[df['EntityType'] == 'INSERT'] if blocks.empty: return pd.DataFrame(columns=['Block_Name', 'Count']) summary = blocks.groupby('BlockName').agg({ 'Handle': 'count' }).reset_index() summary.columns = ['Block_Name', 'Count'] return summary.sort_values('Count', ascending=False) def get_text_content(self, xlsx_file: str) -> pd.DataFrame: """Extract all text content from DWG.""" df = self.read_entities(xlsx_file) # Filter to text entities text_types = ['TEXT', 'MTEXT', 'ATTRIB'] texts = df[df['EntityType'].isin(text_types)] if 'TextContent' in texts.columns: return texts[['Handle', 'EntityType', 'Layer', 'TextContent']].copy() return texts[['Handle', 'EntityType', 'Layer']].copy() def get_entity_statistics(self, xlsx_file: str) -> Dict[str, int]: """Get entity type statistics.""" df = self.read_entities(xlsx_file) if 'EntityType' not in df.columns: return {} return df['EntityType'].value_counts().to_dict() def extract_block_attributes(self, xlsx_file: str, block_name: str) -> pd.DataFrame: """Extract attributes from specific block type.""" df = self.read_entities(xlsx_file) # Find block references blocks = df[(df['EntityType'] == 'INSERT') & (df['BlockName'] == block_name)] # Find associated attributes # Attributes typically follow their parent INSERT in handle order result_data = [] for _, block in blocks.iterrows(): block_handle = block['Handle'] block_data = { 'Block_Handle': block_handle, 'X': block.get('InsertX', 0), 'Y': block.get('InsertY', 0), 'Rotation': block.get('Rotation', 0) } # Add any attribute columns for col in df.columns: if col.startswith('Attr_'): block_data[col] = block.get(col) result_data.append(block_data) return pd.DataFrame(result_data) class DWGAnalyzer: """Advanced DWG analysis tools.""" def __init__(self, exporter: DWGExporter): self.exporter = exporter def analyze_drawing_structure(self, dwg_file: str) -> Dict[str, Any]: """Analyze complete drawing structure.""" xlsx = self.exporter.convert(dwg_file) df = self.exporter.read_entities(str(xlsx)) analysis = { 'file': dwg_file, 'total_entities': len(df), 'layers': self.exporter.get_layers(str(xlsx)).to_dict('records'), 'entity_types': self.exporter.get_entity_statistics(str(xlsx)), 'blocks': self.exporter.get_blocks(str(xlsx)).to_dict('records') } # Calculate extents if coordinates available if 'X' in df.columns and 'Y' in df.columns: analysis['extents'] = { 'min_x': df['X'].min(), 'max_x': df['X'].max(), 'min_y': df['Y'].min(), 'max_y': df['Y'].max() } return analysis def compare_drawings(self, dwg1: str, dwg2: str) -> Dict[str, Any]: """Compare two DWG files.""" xlsx1 = self.exporter.convert(dwg1) xlsx2 = self.exporter.convert(dwg2) df1 = self.exporter.read_entities(str(xlsx1)) df2 = self.exporter.read_entities(str(xlsx2)) layers1 = set(df1['Layer'].unique()) if 'Layer' in df1.columns else set() layers2 = set(df2['Layer'].unique()) if 'Layer' in df2.columns else set() return { 'file1': dwg1, 'file2': dwg2, 'entity_count_diff': len(df2) - len(df1), 'layers_added': list(layers2 - layers1), 'layers_removed': list(layers1 - layers2), 'common_layers': list(layers1 & layers2) } def find_duplicates(self, xlsx_file: str, tolerance: float = 0.001) -> pd.DataFrame: """Find duplicate entities at same location.""" df = self.exporter.read_entities(xlsx_file) if 'X' not in df.columns or 'Y' not in df.columns: return pd.DataFrame() # Round coordinates for grouping df['X_rounded'] = (df['X'] / tolerance).round() * tolerance df['Y_rounded'] = (df['Y'] / tolerance).round() * tolerance # Find duplicates duplicates = df[df.duplicated( subset=['EntityType', 'Layer', 'X_rounded', 'Y_rounded'], keep=False )] return duplicates.sort_values(['X_rounded', 'Y_rounded']) # Convenience functions def convert_dwg_to_excel(dwg_file: str, exporter_path: str = "DwgExporter.exe") -> str: """Quick conversion of DWG to Excel.""" exporter = DWGExporter(exporter_path) output = exporter.convert(dwg_file) return str(output) def batch_convert_dwg(folder: str, exporter_path: str = "DwgExporter.exe", include_subfolders: bool = True) -> List[str]: """Batch convert all DWG files in folder.""" exporter = DWGExporter(exporter_path) results = exporter.batch_convert(folder, include_subfolders) return [r['output'] for r in results if r['status'] == 'success']
Output Structure
Excel Sheets
| Sheet | Content |
|---|---|
| Elements | All DWG entities with properties |
| Layers | Layer definitions |
| Blocks | Block definitions |
| Layouts | Drawing layouts/sheets |
Entity Columns
| Column | Type | Description |
|---|---|---|
| Handle | string | Unique entity handle |
| EntityType | string | LINE, CIRCLE, INSERT, etc. |
| Layer | string | Layer name |
| Color | int | Color index (0-256) |
| Linetype | string | Linetype name |
| Lineweight | float | Line weight in mm |
| X, Y, Z | float | Entity coordinates |
| BlockName | string | For INSERT entities |
| TextContent | string | For TEXT/MTEXT |
Quick Start
# Initialize exporter exporter = DWGExporter("C:/DDC/DwgExporter.exe") # Convert single file xlsx = exporter.convert("C:/Projects/Plan.dwg") print(f"Output: {xlsx}") # Read and analyze df = exporter.read_entities(str(xlsx)) print(f"Total entities: {len(df)}") # Get layer statistics layers = exporter.get_layers(str(xlsx)) print(layers) # Get block usage blocks = exporter.get_blocks(str(xlsx)) print(blocks) # Extract text annotations texts = exporter.get_text_content(str(xlsx)) for _, row in texts.iterrows(): print(f"{row['Layer']}: {row.get('TextContent', 'N/A')}")
Common Use Cases
1. Layer Audit
exporter = DWGExporter() xlsx = exporter.convert("drawing.dwg") layers = exporter.get_layers(str(xlsx)) # Check for non-standard layers standard_layers = ['0', 'WALLS', 'DOORS', 'WINDOWS', 'DIMENSIONS'] non_standard = layers[~layers['Layer'].isin(standard_layers)] print("Non-standard layers:", non_standard['Layer'].tolist())
2. Block Schedule
# Extract all door blocks with attributes doors = exporter.extract_block_attributes(str(xlsx), "DOOR") print(doors[['Block_Handle', 'Attr_DOOR_TYPE', 'Attr_DOOR_SIZE']])
3. Drawing Comparison
analyzer = DWGAnalyzer(exporter) diff = analyzer.compare_drawings("rev1.dwg", "rev2.dwg") print(f"Entities added: {diff['entity_count_diff']}") print(f"New layers: {diff['layers_added']}")
Integration with DDC Pipeline
# Full pipeline: DWG → Excel → Analysis → Report from dwg_exporter import DWGExporter, DWGAnalyzer # 1. Convert DWG exporter = DWGExporter("C:/DDC/DwgExporter.exe") xlsx = exporter.convert("project.dwg") # 2. Analyze structure analyzer = DWGAnalyzer(exporter) analysis = analyzer.analyze_drawing_structure("project.dwg") # 3. Generate report print(f"Drawing: {analysis['file']}") print(f"Entities: {analysis['total_entities']}") print(f"Layers: {len(analysis['layers'])}") print(f"Blocks: {len(analysis['blocks'])}")
Best Practices
- Check DWG version - Older files may have limited data
- Validate layer structure - Clean up before processing
- Handle external references - Bind xrefs if needed
- Batch overnight - Large files take time
- Verify entity counts - Compare with AutoCAD if possible
Resources
- GitHub: cad2data Pipeline
- Video Tutorial: DWG to Excel Pipeline
- DDC Book: Chapter 2.4 - CAD Data Extraction