Vector Data

Vector data is central to geographic information systems. It represents spatial features with geometry types such as points, lines, and polygons. It follows international OGC standards, such as Simple Features, to ensure cross-platform data interoperability. Mainstream formats include GeoPackage for lightweight single-file storage, PostGIS for enterprise databases, and KML for network exchange. Vector data supports coordinate system definitions and efficient storage, such as WKB encoding. It is suitable for cartography, spatial analysis, and mobile applications, improving data sharing and visualization efficiency.

Vector Data Basics

Vector data uses three basic geometry types, points, lines, and polygons, to represent spatial data. These geometry types can model many kinds of real-world geographic features.

Vector data has several characteristics that distinguish it from other kinds of data files:

  • Geometry: only specified geometry types can exist, mainly points, lines, and polygons.
  • Spatial reference: vector data has a spatial reference system, defined by geographic coordinate systems and projected coordinate systems.
  • Feature: a single geographic entity, such as a point, line, or polygon, that encapsulates both geometry and attributes.

Geometry Types in Vector Data

2D Geometry Types

In the Simple Features specification, the following types are generally called geometries.

| Type | Description |
| --- | --- |
| Point, MultiPoint | Used to represent specific locations, such as landmarks, cities, or other points of interest. |
| LineString, MultiLineString | Created by connecting multiple points in order; used to represent linear features such as paths, rivers, and transportation routes. |
| Polygon, MultiPolygon, Triangle | Created from closed lines; used to represent geographic features with area, such as lakes, islands, and administrative regions. |

Note: Other types include PolyhedralSurface and TIN (Triangulated Irregular Network). The software does not currently support them, so they are not described here.

Geometry Coordinates

Geometry coordinates can be 2D (x, y), 3D (x, y, z), or 4D (x, y, z, m). The m value is part of a linear referencing system. Coordinates can also be 2D with an m value, (x, y, m). Three-dimensional geometries are marked with Z after the geometry type, and geometries with a linear referencing system are marked with M after the geometry type. An empty geometry, which has no coordinates, is written with the keyword EMPTY after the type name, for example POINT EMPTY.

In iXGIS, using Z values and M values requires specifying them when creating vector data.
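The naming rules above can be illustrated with a small Python sketch that formats a point as WKT. The helper `point_wkt` is hypothetical and written only for this illustration; it is not part of iXGIS or of any library.

```python
def point_wkt(x=None, y=None, z=None, m=None):
    """Format a point as WKT, adding the Z/M tag that matches the coordinates given."""
    if x is None or y is None:
        return "POINT EMPTY"              # no coordinates at all
    tag = ("Z" if z is not None else "") + ("M" if m is not None else "")
    values = [v for v in (x, y, z, m) if v is not None]
    coords = " ".join(f"{v:g}" for v in values)
    return f"POINT {tag} ({coords})" if tag else f"POINT ({coords})"


print(point_wkt(10, 20))           # POINT (10 20)
print(point_wkt(10, 20, z=5))      # POINT Z (10 20 5)
print(point_wkt(10, 20, m=105))    # POINT M (10 20 105)
print(point_wkt(10, 20, 5, 105))   # POINT ZM (10 20 5 105)
print(point_wkt())                 # POINT EMPTY
```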

Geometry Object Representation Formats

WKB (Well-Known Binary) and WKT (Well-Known Text) are two commonly used geometry object representation formats. They are widely used in GIS, databases such as PostGIS, spatial data transfer, and standardized exchange.

  • WKT (Well-Known Text) is a text-based format used to describe the spatial structure of geometry objects and is easy for people to read.
  • WKB (Well-Known Binary) is the binary equivalent of WKT, designed to support fast exchange and storage of geometry data between different systems. It defines how geometries such as points, lines, and polygons are encoded as byte sequences. It is widely supported by spatial databases and GIS software, which makes data transfer between them efficient.

To transfer data quickly between the server and the user's browser, iXGIS uses the WKB format. Binary encoding is more compact than text-based formats, so less data is transferred and features are displayed faster in the user's browser.

For example, for the following point feature:

POINT (10 20)

The WKB representation is:

010100000000000000000024400000000000003440
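The byte layout of this example can be reproduced with Python's standard `struct` module. The sketch below handles only 2D points and assumes little-endian output; the function names are chosen for this illustration.

```python
import struct


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # "<" little-endian, no padding; "B" byte-order flag (1 = little-endian);
    # "I" geometry type code (1 = Point); "dd" x and y as 64-bit floats.
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_point(wkb):
    """Decode a 2D point from WKB, honouring its byte-order flag."""
    order = "<" if wkb[0] == 1 else ">"
    geom_type, x, y = struct.unpack(order + "Idd", wkb[1:21])
    if geom_type != 1:
        raise ValueError("not a 2D Point")
    return x, y


wkb = point_to_wkb(10, 20)
print(wkb.hex())          # 010100000000000000000024400000000000003440
print(wkb_to_point(wkb))  # (10.0, 20.0)
```

The first byte is the byte-order flag, the next four bytes are the geometry type, and the remaining sixteen bytes are the two coordinates.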

Coordinate Systems in Vector Data

A spatial reference system (SRS) or coordinate reference system (CRS) provides a standard framework for locating specific positions on Earth's surface in a geographic information system (GIS). These systems are the result of applying coordinate systems and analytic geometry to geospatial data management.

When geometry coordinates are represented, the spatial reference system used by the geometry must be declared so that the feature can be expressed correctly.

  • Geodetic/geographic coordinate system (GCS): a spherical coordinate system that directly measures positions on Earth, modeled as a sphere or ellipsoid, using latitude, the degrees north or south of the equator, and longitude, the degrees west or east of the prime meridian.
  • Planar/grid/projected coordinate system (PCS): a standardized Cartesian coordinate system that models Earth, or more commonly a large region of it, as a plane. There are thousands of map projections. Each projection is based on a specific mathematical model and is designed to balance accurate representation of regional characteristics with map usability. Common examples include:
    • Universal Transverse Mercator (UTM): the world is divided into 60 longitudinal zones, each 6° wide (120 zones if the northern and southern hemispheres are counted separately), and each zone uses its own transverse Mercator projection. It is suitable for accurate measurements on medium-scale maps.
    • Web Mercator (WGS 84 / Pseudo-Mercator): mainly used for web map services such as Tianditu and OpenStreetMap. It optimizes global-scale visual display but has large distortion in high-latitude regions.

Distance Measurement and Web Mercator

Although imagery in online maps such as Tianditu, Amap, and Baidu Maps is displayed in Web Mercator, Web Mercator has large distance distortion. Therefore, when using the distance measurement function, distances are not calculated in Web Mercator. They are calculated with Vincenty's formulae, and this distance is consistent with distances measured in real life.
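To show the size of the distortion, the following Python sketch compares a planar distance in Web Mercator with the distance from Vincenty's inverse formula on the WGS 84 ellipsoid. It is an independent illustration, not the iXGIS implementation; the function names are assumptions, and this simple version raises an error for nearly antipodal points, where the iteration may not converge.

```python
import math

WGS84_A = 6378137.0                # semi-major axis in metres
WGS84_F = 1 / 298.257223563        # flattening
WGS84_B = WGS84_A * (1 - WGS84_F)  # semi-minor axis in metres


def vincenty_distance(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Ellipsoidal distance in metres between two points given in degrees."""
    if lat1 == lat2 and lon1 == lon2:
        return 0.0
    u1 = math.atan((1 - WGS84_F) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - WGS84_F) * math.tan(math.radians(lat2)))
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)
    big_l = math.radians(lon2 - lon1)
    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0:
            return 0.0                                  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # On the equator cos_sq_alpha is 0; the term is then defined as 0.
        cos_2sm = (cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha
                   if cos_sq_alpha != 0 else 0.0)
        c = WGS84_F / 16 * cos_sq_alpha * (4 + WGS84_F * (4 - 3 * cos_sq_alpha))
        lam_prev = lam
        lam = big_l + (1 - c) * WGS84_F * sin_alpha * (
            sigma + c * sin_sigma * (cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        if abs(lam - lam_prev) < tol:
            break
    else:
        raise ValueError("Vincenty iteration did not converge")
    u_sq = cos_sq_alpha * (WGS84_A ** 2 - WGS84_B ** 2) / WGS84_B ** 2
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4 * (
        cos_sigma * (-1 + 2 * cos_2sm ** 2)
        - big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sm ** 2)))
    return WGS84_B * big_a * (sigma - delta_sigma)


def web_mercator_distance(lat1, lon1, lat2, lon2):
    """Planar distance between the two points after projecting to Web Mercator."""
    def project(lat, lon):
        x = WGS84_A * math.radians(lon)
        y = WGS84_A * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
        return x, y
    (x1, y1), (x2, y2) = project(lat1, lon1), project(lat2, lon2)
    return math.hypot(x2 - x1, y2 - y1)


# One degree of longitude at 60°N: the projected length is about twice the real one.
real = vincenty_distance(60, 0, 60, 1)
planar = web_mercator_distance(60, 0, 60, 1)
print(round(real), round(planar), round(planar / real, 2))
```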

Overview of the OGC Simple Features Specification

In iXGIS, features in vector data follow the OGC Simple Features Access (SFA) design specification. A brief overview follows.

SFA Design Concept

The design concept of the OGC Simple Features specification (SFA) can be summarized as: "Use the most general mathematical model to solve the broadest set of geospatial representation problems."

Its emergence moved geographic information from early closed, proprietary formats into the modern era of open and database-oriented data. The following is a deeper explanation of its core design concepts and ideas.


  1. Core design concept: the least common denominator principle

SFA's core idea is to define a minimum subset of geometries that all systems can understand.

  • Cross-platform consistency: databases such as Oracle and PostgreSQL, and software such as ArcGIS and QGIS, can recognize the same POINT or POLYGON.
  • Universality at the cost of complexity: it omits parametric curves that are difficult to calculate consistently, such as circular arcs and splines, and requires all geometries to consist of straight-line segments. Although this compromises precision in some cases, it greatly reduces the algorithmic difficulty of spatial analysis such as intersection and union.

  2. Spatial object model: component-based hierarchy

SFA uses object-oriented design to build a rigorous inheritance hierarchy of geometry objects.

  • Base class (Geometry): the ancestor of all objects, defining common spatial properties such as SRID and dimension.
  • Atomic types: Point, LineString, and Polygon.
  • Collection types (GeometryCollection): containers such as MultiPoint, MultiLineString, and MultiPolygon are designed to handle complex geographic entities, allowing one feature to consist of multiple physical parts.

  3. Core idea: decoupling and association between geometry and attributes

SFA divides geographic information into two core parts and defines how they interact:

  • Geometry: a purely mathematical description, including coordinates and topology.
  • Feature: a real-world entity. One feature equals geometry plus business attributes.
  • Association mechanism: in a database, this appears as one row per feature. One column, the geometry column, stores the geometry, while other columns store attributes such as name, population, and type. This design allows SQL to process geographic data directly.

  4. Mathematical definition of topology and spatial relationships (DE-9IM)

SFA defines not only shapes but also how shapes interact with one another.

It introduces the DE-9IM (Dimensionally Extended 9-Intersection Model) design concept. By evaluating intersections among the interior, boundary, and exterior of two geometry objects, it defines eight standard spatial relationship operations:

  • Equals, Disjoint, Intersects, Touches, Crosses, Within, Contains, Overlaps.

Significance: this design converts questions such as "which provinces does this river pass through?" into standard mathematical Boolean operations.


  5. Standard serialized representations (WKT and WKB)

To allow different systems to exchange these simple features, SFA defines two highly successful representation methods:

  1. WKT (Well-Known Text):
    • Idea: human-readable.
    • Example: POLYGON((0 0, 10 0, 10 10, 0 10, 0 0)). This allows developers to write geometries directly in code or on the command line.
  2. WKB (Well-Known Binary):
    • Idea: efficient for machines.
    • Use: fast transfer between databases and applications as binary streams, reducing parsing overhead.
    • Note: to reduce data transfer from server to browser, iXGIS features use WKB to represent geometry.

How to Understand "Simple"

In geographic information systems (GIS), the OGC Simple Features Access (SFA) specification is foundational. Here, "Simple" does not mean the functionality is primitive. It is a precise term from geometry and computer science.

We can understand this "simple" from three dimensions.


  1. Geometric definition: non-self-intersecting

In mathematics and topology, a simple geometry is an object that does not intersect itself.

  • Simple curve: for example, a line from A to B that does not cross itself like a twist.
  • Simple polygon: a boundary that does not self-intersect and has no holes, or whose hole boundaries do not intersect the exterior boundary.

Core logic: if a polygon edge crosses itself, it is no longer simple, and the algorithmic complexity of processing it increases geometrically. The specification requires geometries to be simple to ensure unambiguous topological relationships.
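A brute-force check for crossing edges can be sketched in a few lines of Python. This is a simplified illustration with hypothetical function names: it detects only proper crossings, where two non-adjacent segments cut through each other, and ignores touching or overlapping collinear segments that a full simplicity test would also need to handle.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (cross > 0) - (cross < 0)


def _properly_cross(p1, p2, q1, q2):
    """True when segments p1-p2 and q1-q2 cross at a single interior point."""
    return (_orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
            and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0)


def has_self_crossing(coords):
    """O(n^2) test for a line or ring whose segments cross each other."""
    segments = list(zip(coords, coords[1:]))
    for i in range(len(segments)):
        for j in range(i + 2, len(segments)):   # skip neighbours, which share a vertex
            if _properly_cross(*segments[i], *segments[j]):
                return True
    return False


square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
bowtie = [(0, 0), (4, 4), (4, 0), (0, 4), (0, 0)]
print(has_self_crossing(square))  # False
print(has_self_crossing(bowtie))  # True
```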


  2. Dimensional constraints: linear interpolation in a two-dimensional plane

"Simple" is also reflected in constraints on how shapes are described:

  • Linear interpolation: all curves are composed of straight-line segments. Even a circle must be represented under SFA as a polygon with many vertices. Complex mathematical curves such as circular arcs, Bezier curves, and splines are not supported.
  • Dimensional constraints: the focus is mainly on 0D points, 1D lines, and 2D polygons. Although the Z axis, or height, is now supported, the core computational logic is still based on topological relationships in a two-dimensional plane.
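The effect of linear interpolation can be seen by approximating a circle with a polygon and comparing the enclosed area with the true circle area. The Python sketch below uses hypothetical helper names and is only meant to illustrate the trade-off between vertex count and precision.

```python
import math


def circle_as_polygon(cx, cy, radius, segments):
    """Return a closed ring of (x, y) vertices approximating a circle."""
    ring = [(cx + radius * math.cos(2 * math.pi * i / segments),
             cy + radius * math.sin(2 * math.pi * i / segments))
            for i in range(segments)]
    ring.append(ring[0])   # a polygon ring must end where it starts
    return ring


def ring_area(ring):
    """Shoelace formula for the area enclosed by a closed ring."""
    return abs(sum(x1 * y2 - x2 * y1
                   for (x1, y1), (x2, y2) in zip(ring, ring[1:]))) / 2


for n in (8, 32, 128):
    ratio = ring_area(circle_as_polygon(0, 0, 1, n)) / math.pi
    print(n, round(ratio, 4))   # the ratio approaches 1 as the vertex count grows
```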

  3. Data model: simplification of the object model

From a software engineering perspective, "simple features" means complex real-world objects are abstracted into a standard and predictable object structure.

  • Standard definitions: it defines familiar Point, LineString, and Polygon types and their collections, such as MultiPoint.
  • Easy storage: because the definitions are simple, these features can be stored conveniently in relational database tables, such as PostGIS and MySQL, or represented by human-readable strings such as WKT (Well-Known Text).

| Feature Type | Description | Example (WKT) |
| --- | --- | --- |
| Point | A single coordinate point | POINT(1 1) |
| LineString | A sequence of points connected by straight-line segments | LINESTRING(0 0, 1 1, 2 1) |
| Polygon | Closed rings, including an exterior ring and optional interior rings | POLYGON((0 0, 4 0, 4 4, 0 4, 0 0)) |

Summary

The real meaning of "Simple" is: geometric objects based on the Euclidean plane, with non-self-intersecting boundaries, composed of straight-line segments.

This simplification is an extremely successful strategy. It discards extremely complex nonlinear mathematical models in exchange for computational efficiency and universal support across systems, including databases, web software, and desktop software.

Later Extensions of SFA

From "straight lines" to "circular arcs": the evolution of geographic information specifications.

During the digitization of geographic information systems (GIS), OGC SFA (Simple Features Access) once acted like a strict teacher, requiring the world to consist only of points, lines, and polygons, and requiring lines to be straight. This simplification made database processing very fast, but it left one limitation: the real Earth is full of curves.

To bridge this gap, OGC used later extensions, mainly by incorporating the ISO SQL/MM standard, to make "simple features" no longer so simple.


  1. Breaking the straight-line limitation: nonlinear geometry

Early SFA required every curve to be represented by many small straight-line segments, known as linear interpolation. Later extensions introduced mathematical functions to describe shapes.

Key New Members

  • CircularString: no longer an approximate circle made from hundreds or thousands of points, but a mathematical circular arc precisely defined by three points: start point, midpoint on the arc, and end point.
  • CompoundCurve: a hybrid curve. Imagine a highway segment that uses LineString on straight sections and switches seamlessly to CircularString on curved sections.
  • CurvePolygon: a polygon whose boundary can be made of arcs. For example, a perfectly circular fountain can be represented precisely in a database by a few control points on its boundary instead of a many-vertex approximation.
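Three points are enough to define an arc because exactly one circle passes through three non-collinear points. The Python sketch below recovers the centre and radius of that circle; the function name is an assumption made for this illustration.

```python
import math


def arc_center_radius(start, mid, end):
    """Centre and radius of the circle through three non-collinear points."""
    (ax, ay), (bx, by), (cx, cy) = start, mid, end
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("points are collinear; no unique circle")
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)


# CIRCULARSTRING(1 0, 0 1, -1 0): the upper half of the unit circle
center, radius = arc_center_radius((1, 0), (0, 1), (-1, 0))
print(center, radius)   # (0.0, 0.0) 1.0
```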

  2. Adding depth and measure: Z axis and M values

The original SFA handled two-dimensional and three-dimensional data in a somewhat mixed way. The extended specification formally established the XYZM four-dimensional model.

  • Z (height/depth): allows geographic features to move from flat "paper-based" representation to three-dimensional digital twins.
  • M (measure): an elegant design. It does not represent a spatial position but a logical measure.
  • Use case: on a railway line, even if a train's longitude/latitude coordinates (x, y) change, its mileage value M is a fixed business reference. This lets users query "there is a signal light at kilometer 105 of the Beijing-Shanghai High-Speed Railway" without requiring precise longitude and latitude.
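Locating a position from a measure can be sketched as linear interpolation along the segments of a LINESTRING M. The Python example below uses a made-up track whose M values are kilometre posts, and assumes the measures increase along the line; the function name is hypothetical.

```python
def locate_measure(line_m, measure):
    """Return the (x, y) position on a LINESTRING M whose m value equals `measure`."""
    for (x1, y1, m1), (x2, y2, m2) in zip(line_m, line_m[1:]):
        if m1 <= measure <= m2:
            # Fraction of the way along this segment, guarded against zero-length m ranges.
            t = 0.0 if m2 == m1 else (measure - m1) / (m2 - m1)
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    raise ValueError("measure is outside the range covered by the line")


# Hypothetical track: each vertex is (x, y, m), with m as the kilometre post.
track = [(0.0, 0.0, 100.0), (10.0, 0.0, 110.0), (10.0, 5.0, 115.0)]
print(locate_measure(track, 105.0))   # (5.0, 0.0)
print(locate_measure(track, 112.5))   # (10.0, 2.5)
```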

  3. A standardized topology laboratory: DE-9IM

To ensure different software uses consistent logic for judging "intersects" and "contains", the specification strengthened DE-9IM (Dimensionally Extended 9-Intersection Model).

It works like a DNA test between geometries. By analyzing intersections among the interior, boundary, and exterior of two objects, it uses a nine-character matrix, such as T*****FF*, to define spatial relationships. This ensures that an "overlaps" result calculated in PostGIS is exactly consistent with the result in ArcGIS or QGIS.
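Matching a computed matrix against such a pattern is a character-by-character comparison. In the Python sketch below, `relate_matches` is a hypothetical helper; the sample matrix `212FF1FF2` is the result for a polygon that strictly contains a smaller polygon, and `T*****FF*` is the pattern for Contains.

```python
def relate_matches(matrix, pattern):
    """Check a 9-character DE-9IM matrix against a pattern such as 'T*****FF*'."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM strings must have exactly 9 characters")
    for cell, want in zip(matrix, pattern):
        if want == "*":
            continue                   # any value is acceptable
        if want == "T" and cell in ("0", "1", "2"):
            continue                   # T means the intersection is non-empty
        if want == cell:
            continue                   # exact match on F, 0, 1 or 2
        return False
    return True


matrix = "212FF1FF2"                            # large polygon A, small polygon B inside it
print(relate_matches(matrix, "T*****FF*"))      # True:  A contains B
print(relate_matches(matrix, "T*F**F***"))      # False: A is not within B
print(relate_matches(matrix, "FF*FF****"))      # False: A and B are not disjoint
```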


  4. Why are these extensions worth the effort?

You may ask: if arcs can be broken into line segments, why introduce this complexity?

  1. Storage efficiency: storing a three-point arc uses more than 90% less space than storing a 1,000-point approximate circular polyline.
  2. Calculation accuracy: in precision surveying and mapping, errors introduced by line-segment simulation accumulate during spatial overlay.
  3. Industry semantics: transportation, water engineering, and building information modeling (BIM) are natively curve-based. Direct curve support allows GIS to integrate smoothly with these industries.
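The storage figure can be checked with simple arithmetic, assuming both geometries are stored as 2D WKB, where a LineString or CircularString has a 9-byte header followed by 16 bytes per point. The helper name below is an assumption made for this illustration.

```python
def wkb_size(num_points, dims=2):
    """Bytes in a WKB LineString or CircularString: 9-byte header plus 8 bytes per ordinate."""
    header = 1 + 4 + 4            # byte order + geometry type + point count
    return header + num_points * dims * 8


arc = wkb_size(3)          # three-point CircularString
polyline = wkb_size(1000)  # 1,000-vertex linear approximation
print(arc, polyline, f"{1 - arc / polyline:.1%}")   # 57 16009 99.6%
```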

  5. Current status: balance between ideal and reality

Although the SQL/MM specification is comprehensive, software support still lags behind it:

  • Databases such as PostGIS: can already store and read these curve types, although some spatial functions work on a linearized version of the geometry.
  • Frontend rendering libraries such as OpenLayers and Leaflet: because of browser drawing performance limitations, they often linearize arcs before displaying them.
  • Data exchange formats such as GeoJSON: unfortunately, the most popular GeoJSON format still supports only linear paths for simple features and does not support native arcs.

The move from SFA to SQL/MM extensions is GIS's transition from "drawing maps" to "simulating reality". Current specifications preserve the efficiency of simple features while also enabling precise representation of a complex world.

Vector Data Storage

Vector data can be stored and exchanged in many formats, each with specific uses, advantages, and limitations. In the design of iXGIS, we try to minimize the need to focus on formats themselves and instead reduce differences among formats so users can focus on vector data. However, understanding the differences among formats helps us better understand vector data. The following introduces several common data formats.

ESRI Shapefile

ESRI Shapefile (.shp), or shapefile for short, was developed by Environmental Systems Research Institute (ESRI). It is an open, non-topological spatial data format used to store the geometric location and attribute information of geographic features. Geographic features in a shapefile can be represented as points, lines, or polygons. A workspace containing shapefiles can also contain dBASE tables, which store additional attributes that can be joined to shapefile features.

Shapefile Components

A shapefile refers to a file-based storage method. In practice, the format consists of multiple files. Three files are required to form a shapefile:

  • .shp - geometry format, used to store the geometric entities of features.
  • .shx - geometry index format, an index of geometry locations that records the position of each geometry in the SHP file and improves forward or backward search efficiency.
  • .dbf - attribute data format, storing attribute data for each geometry in dBASE IV table format.

A group of files representing the same dataset should share the same filename prefix. For example, to store geometry and attribute data about a lake, lake.shp, lake.shx, and lake.dbf are all required. The file with the .shp suffix is the "real" shapefile, but data is incomplete if only that file exists. The other two files must be included to form a complete geographic dataset. In addition to these three required files, optional files can enhance spatial data representation.

The following files are optional in the format. In practice, however, a complete .prj file should always be included; otherwise, the spatial position of vector data may be incorrect:

  • .prj stores coordinate system parameters.
  • .sbn is a spatial index used to optimize queries.
  • .sbx is the companion file of the .sbn spatial index and helps reduce loading time.
  • .xml stores associated metadata.
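Before uploading or sharing a shapefile, it can be useful to check that its companion files are present. The following Python sketch does this with `pathlib`; the function name is hypothetical, and the check is case-sensitive, so a file named `LAKE.DBF` would need extra handling on some systems.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")   # must exist for a usable shapefile
RECOMMENDED = (".prj",)               # optional in the format, but needed for correct positioning


def missing_parts(shp_path):
    """Return (missing required, missing recommended) file extensions for a shapefile."""
    path = Path(shp_path)
    missing_required = [ext for ext in REQUIRED if not path.with_suffix(ext).exists()]
    missing_recommended = [ext for ext in RECOMMENDED if not path.with_suffix(ext).exists()]
    return missing_required, missing_recommended


required, recommended = missing_parts("lake.shp")
print("missing required files:", required)
print("missing recommended files:", recommended)
```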

XML Files

In iXGIS, shapefile metadata is stored in the database and associated XML files are neither stored nor read. Therefore, XML files are not recognized as data in iXGIS data management and are recognized only as ordinary files.

Shapefile Limitations

Shapefiles have the following limitations:

  • File size limit: each component file, such as .shp or .dbf, is limited to 2 GB.
  • Maximum field name length: 10 characters.
  • Maximum number of fields: 255.
  • Null values are supported only for the Date field type; numeric and text fields in shapefiles cannot store nulls.
  • Shapefiles cannot store topology information or relationships.
  • Shapefile and dBASE files cannot store non-English characters by default.
  • In field view, fields can be added, deleted, or copied, but field properties cannot be modified after fields are saved.

Because of these limitations, we do not recommend using shapefiles as your preferred file format. We recommend using vector data stored in PostGIS to access the full functionality of the software.

GeoPackage Vector Data

GeoPackage is an open, standardized, platform-independent, and self-describing database format for storing and transferring geospatial information. It is based on the SQLite database format and is developed and maintained by the Open Geospatial Consortium (OGC). The main goal of GeoPackage is to allow users to easily share and publish geospatial data across different GIS software, devices, and platforms.

The main characteristics are:

  • Support for multiple data types: GeoPackage supports various types of geospatial data, including vector data such as points, lines, and polygons; raster data such as satellite imagery and map tiles; and other geographic information such as feature attribute tables.
  • Single-file storage: all geospatial data and related information are stored in a single SQLite database file. This makes data management, sharing, and transfer very convenient.
  • Open standard: as an open international standard, GeoPackage is designed to promote geospatial data interoperability and sharing. It avoids the limitations of proprietary formats and ensures long-term data accessibility and usability.
  • Broad application support: because it is open and standardized, many GIS software packages and tools support GeoPackage, including but not limited to ArcGIS, QGIS, and Google Earth.
  • Mobile-friendly: GeoPackage is especially suitable for mobile applications because it is based on the lightweight SQLite database and is easy to process and use on mobile devices.

When storing vector data, GeoPackage can provide capabilities beyond ESRI Shapefile. However, when storing raster data, it can only store data as PNG/JPG through a tile pyramid mechanism, which is suitable for web browsing but does not implement the raster data functions defined by OGC. Therefore, in the software, when data is saved as GeoPackage, only vector data can be saved. In the cloud resource manager, it is displayed as GeoPackage Vector (*gpv).

Vector Data Storage

  • Storage method:

    • Each vector layer is an independent SQLite table.
    • One column in the table stores geometry objects as binary BLOBs, each consisting of a short GeoPackage header followed by WKB.
    • The remaining columns are attribute fields.
  • Key tables:

    • gpkg_contents: records metadata for all layers, including name, type, and spatial extent.
    • gpkg_geometry_columns: records the geometry column name, type, such as POINT or LINESTRING, and spatial reference system (SRS) for each vector layer.
  • Example structure:

CREATE TABLE roads (
  id INTEGER PRIMARY KEY,
  name TEXT,
  geom BLOB -- Stores the geometry (GeoPackage header + WKB)
);

  • Register metadata:

INSERT INTO gpkg_contents (
  table_name, data_type, identifier, srs_id, min_x, min_y, max_x, max_y
) VALUES (
  'roads', 'features', 'roads', 4326, 120.1, 30.1, 120.9, 30.9
);

INSERT INTO gpkg_geometry_columns (
  table_name, column_name, geometry_type_name, srs_id, z, m
) VALUES (
  'roads', 'geom', 'LINESTRING', 4326, 0, 0
);
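
The geometry BLOB layout can be explored with Python's built-in `sqlite3` and `struct` modules. The sketch below is not a valid GeoPackage file, because it omits the required metadata tables. It uses a hypothetical `stations` table and a point geometry only to show the 8-byte header (magic "GP", version, flags, SRS id) that precedes the WKB.

```python
import sqlite3
import struct


def gpkg_point_blob(x, y, srs_id):
    """Wrap a WKB point in a GeoPackage geometry header (no envelope)."""
    # Header: magic "GP", version 0, flags 0x01 (little-endian, no envelope), SRS id.
    header = struct.pack("<2sBBi", b"GP", 0, 0x01, srs_id)
    wkb = struct.pack("<BIdd", 1, 1, x, y)
    return header + wkb


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stations (id INTEGER PRIMARY KEY, name TEXT, geom BLOB)")
conn.execute("INSERT INTO stations (name, geom) VALUES (?, ?)",
             ("Station A", gpkg_point_blob(120.5, 30.5, 4326)))

name, blob = conn.execute("SELECT name, geom FROM stations").fetchone()
magic, version, flags, srs_id = struct.unpack("<2sBBi", blob[:8])
x, y = struct.unpack("<dd", blob[8 + 5:])   # skip the header and the WKB byte-order/type fields
print(name, magic, srs_id, (x, y))          # Station A b'GP' 4326 (120.5, 30.5)
```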

Raster Data Storage

GeoPackage stores raster data using a tile pyramid mechanism, which is especially suitable for map tiling services such as web map tiles.

  • Storage method:

    • A raster layer consists of multiple tile images, such as PNG or JPEG.
    • Tiles are organized by zoom level, tile row, and tile column.
    • Image data is stored in BLOB fields.
  • Key tables:

    • gpkg_tile_matrix_set: records the spatial reference and bounding box of a tile layer.
    • gpkg_tile_matrix: records resolution, row count, column count, and related information at different zoom levels.
    • Tile image table: usually named after the layer and containing zoom_level, tile_column, tile_row, and tile_data.
  • Example tile table structure:

CREATE TABLE maptiles (
  id INTEGER PRIMARY KEY,
  zoom_level INTEGER NOT NULL,
  tile_column INTEGER NOT NULL,
  tile_row INTEGER NOT NULL,
  tile_data BLOB NOT NULL -- Binary image data
);
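
How a location maps to a tile can be sketched in Python. The example assumes the common Web Mercator tiling scheme, in which each zoom level has 2^zoom columns and rows and row 0 is at the top. A GeoPackage may define a different tile matrix set, so this is an illustration rather than a general rule, and the function name is hypothetical.

```python
import math


def tile_for(lon, lat, zoom):
    """Tile column and row containing a lon/lat point, with row 0 at the top."""
    n = 2 ** zoom                                   # tiles per axis at this zoom level
    col = int((lon + 180.0) / 360.0 * n)
    row = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return min(col, n - 1), min(row, n - 1)         # clamp points on the far edge


print(tile_for(0.0, 0.0, 1))       # (1, 1)
print(tile_for(-180.0, 85.0, 0))   # (0, 0)
print(2 ** 10 * 2 ** 10)           # 1048576 tiles at zoom level 10
```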

PostGIS Vector Data

PostGIS is the spatial extension for PostgreSQL. It enables the database to support geospatial data types, indexes, and functions. Vector data is the most commonly used form of data in PostGIS and is used to represent spatial features such as points, lines, and polygons.

What Is PostGIS?

PostGIS began as a project by Refractions Research (http://refractions.net), a geospatial consulting company in Victoria, Canada. It has since been adopted and improved by governments, universities, public organizations, and other companies.

PostGIS is based on the PostgreSQL object-relational database management system (ORDBMS), which provides transaction support, GiST index support for spatial objects, and ready-to-use query management capabilities.

Why PostGIS?

PostGIS supports OGC SFS. OGC (Open Geospatial Consortium) is the authoritative global organization for geographic information standards. Simple Features Specification (SFS) is one of the earliest and most fundamental standards published by OGC. It was first released in 1999. Its goals are to:

  • Define a common data model for geometries.
  • Standardize how vector data is stored, operated on, and exchanged.
  • Make software from different vendors, such as Oracle Spatial, ESRI, and PostGIS, interoperable.

The advantages of supporting OGC SFS are:

  • Standardization: data can be exchanged among different databases, such as Oracle Spatial, SQL Server, PostGIS, and SpatiaLite.
  • Extensibility: GIS clients such as QGIS and ArcGIS can connect directly to PostGIS and read standard geometries.
  • Complete functionality: hundreds of functions cover most operations defined by OGC.
  • Future compatibility: as OGC and ISO standards evolve, PostGIS continues to extend accordingly.

Core Definitions of Geometry Data in SFS

A. Supported geometry types. Simple geometries consist of the following seven base types:

| Geometry Type | Description |
| --- | --- |
| Point | A position (X, Y). |
| LineString | An ordered set of points that does not self-intersect. |
| Polygon | A closed ring, with an exterior ring and optional interior rings. |
| MultiPoint | A collection of points. |
| MultiLineString | Multiple LineStrings. |
| MultiPolygon | Multiple Polygons. |
| GeometryCollection | A combination of arbitrary geometries. |

All these geometry types can be represented with WKT (Well-Known Text) or WKB (Well-Known Binary).

B. Geometry properties

OGC SFS requires geometry objects to satisfy:

  • Simplicity: for example, a LineString does not self-intersect and a Polygon does not overlap itself.
  • Validity: geometry is closed and correct.
  • Dimension: points are 0D, lines are 1D, and polygons are 2D.

PostGIS was the first open-source spatial database extension to fully pass OGC SFS conformance tests. Its goal is:

"Allow users to confidently use PostGIS as a standard, reliable, and interoperable spatial database."

In PostGIS:

  • All geometry types comply with the SFS standard.
  • All spatial predicates and functions follow OGC specifications as much as possible.
  • WKT/WKB input and output are supported.
  • SQL/MM standard interfaces are available.

Supported Geometry Types

In PostGIS, vector data is based on the OGC Simple Features standard, with the following core data types:

| Data Type | Meaning | Example |
| --- | --- | --- |
| POINT | Single point | City coordinates, sampling points |
| LINESTRING | Polyline, or ordered point set | Rivers, road centerlines |
| POLYGON | Polygon, or closed area | Administrative regions, lakes |
| MULTIPOINT | Multiple points | Sensor group locations |
| MULTILINESTRING | Multiple polylines | Road network |
| MULTIPOLYGON | Multiple polygons, usually with holes | Distributed protected areas |
| GEOMETRYCOLLECTION | Mixed geometries, rarely used | A combination of multiple geometries |

In addition, PostGIS supports polyhedral surfaces, curved surfaces, and TIN, making it comparable to Esri geodatabases.

KML Format

Keyhole Markup Language (KML) (.kml) is an XML-based format for storing geographic data and related content. It is an Open Geospatial Consortium (OGC) standard. KML is convenient for publishing on the Internet and can be viewed by many free applications such as Google Earth and ArcGIS Explorer, so it is often used to share geographic data with non-GIS users. Many domestic GIS software products also support exporting user markers as KML files.

KML files use either the .kml extension or the .kmz extension, which indicates a compressed KML file.

iXGIS cannot directly edit KML files, but it supports reading and exporting them. Users can upload and import KML files directly in iXGIS, and can also export vector data as KML files.

KML files are XML and can be opened directly with a text editor. A simple KML file example is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>A Place</name>
    <description>This is a description of the place.</description>
    <Point>
      <coordinates>-122.0822035425683,37.42228990140251,0</coordinates>
    </Point>
  </Placemark>
</kml>
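
Because KML is plain XML, a file like this can be read with Python's standard `xml.etree.ElementTree` module. The sketch below extracts point placemarks only and uses a hypothetical function name. Note that KML coordinates are ordered longitude, latitude, altitude.

```python
import xml.etree.ElementTree as ET

KML_URI = "http://www.opengis.net/kml/2.2"
KML_NS = {"kml": KML_URI}

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>A Place</name>
    <Point>
      <coordinates>-122.0822035425683,37.42228990140251,0</coordinates>
    </Point>
  </Placemark>
</kml>"""


def placemark_points(kml_data):
    """Yield (name, lon, lat, alt) for every Placemark that holds a Point."""
    root = ET.fromstring(kml_data)
    for placemark in root.iter(f"{{{KML_URI}}}Placemark"):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue                                   # not a point placemark
        parts = [float(v) for v in coords.strip().split(",")]
        alt = parts[2] if len(parts) > 2 else 0.0      # altitude is optional
        yield name, parts[0], parts[1], alt


for record in placemark_points(SAMPLE):
    print(record)   # ('A Place', -122.0822035425683, 37.42228990140251, 0.0)
```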

GeoJSON

GeoJSON is an open standard format based on JSON (JavaScript Object Notation) for representing geospatial data. It is both a data exchange format and the de facto standard for Web GIS visualization.

GeoJSON is widely used in browser maps such as Leaflet and OpenLayers, web APIs such as RESTful services, spatial databases such as PostGIS and MongoDB, and mobile GIS applications.

GeoJSON Characteristics

  • Plain text format, easy to read and edit.
  • Direct support for UTF-8 encoding.
  • Highly compatible with JavaScript, Python, PostGIS, and NoSQL.
  • Supports integrated vector features and attributes.
  • Can be rendered directly on the frontend.

GeoJSON File Structure

A GeoJSON file is a JSON object and must contain a type field indicating the data type.

The most common structures are:

| Type | Description |
| --- | --- |
| Point | Single point |
| LineString | Polyline |
| Polygon | Polygon |
| Feature | Single feature with attributes |
| FeatureCollection | Collection of multiple features, most common |
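
These structures nest inside one another: a FeatureCollection holds Features, and each Feature holds one geometry plus its properties. The Python sketch below builds a small collection with the standard `json` module. The helper `feature` and the sample names other than Beijing are made up for this illustration.

```python
import json


def feature(geometry_type, coordinates, **properties):
    """Build a GeoJSON Feature: one geometry plus its attribute dictionary."""
    return {
        "type": "Feature",
        "geometry": {"type": geometry_type, "coordinates": coordinates},
        "properties": properties,
    }


collection = {
    "type": "FeatureCollection",
    "features": [
        feature("Point", [116.4, 39.9], name="Beijing"),
        feature("LineString", [[0, 0], [1, 1], [2, 1]], name="Sample route"),
        # Polygon coordinates are a list of rings; each ring repeats its first position at the end.
        feature("Polygon", [[[0, 0], [4, 0], [4, 4], [0, 4], [0, 0]]], name="Sample area"),
    ],
}

text = json.dumps(collection, indent=2)
print(json.loads(text)["features"][0]["geometry"]["type"])   # Point
```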

GML

GML (Geography Markup Language) is an open standard based on XML (eXtensible Markup Language) for representing, storing, and exchanging geospatial information. It is developed by OGC (Open Geospatial Consortium) and is intended to serve as a universal spatial data language across platforms and systems.

GML is the default exchange format for OGC WFS (Web Feature Service), and clients such as QGIS and ArcGIS can parse it directly.

Main Characteristics of GML

  • Based on XML, with a clear and extensible structure.
  • Can describe geometry, attributes, coordinate reference systems, and topological relationships.
  • Supports rich geographic feature models.
  • Provides schemas for constraints to ensure data consistency.
  • Closely integrated with OGC standards such as WFS and WMS.

GML is the preferred standard for many governments, land and resources departments, surveying and mapping departments, and international data exchange.

Core Components of GML

The basic unit of GML is a feature. Each feature includes:

  1. Property
    • Examples include name, category, and ID.
  2. Geometry
    • Points, lines, and polygons, supporting OGC Simple Features.
    • Complex geometries such as compound curves and surfaces.
  3. Coordinate Reference System (CRS)
    • Can define EPSG codes or custom CRS definitions.
  4. Identifier and Metadata
    • Feature ID and schema definition.

Example: a simple GML point

<gml:Point srsName="EPSG:4326">
  <gml:coordinates>116.4,39.9</gml:coordinates>
</gml:Point>

Example: feature plus attributes and geometry

<gml:featureMember>
  <myNS:CityFeature gml:id="city.1">
    <myNS:name>Beijing</myNS:name>
    <myNS:population>21540000</myNS:population>
    <myNS:location>
      <gml:Point srsName="EPSG:4326">
        <gml:coordinates>116.4,39.9</gml:coordinates>
      </gml:Point>
    </myNS:location>
  </myNS:CityFeature>
</gml:featureMember>

Geometry Types Supported by GML

GML follows the OGC Simple Features model and supports:

  • Point
  • LineString
  • Polygon
  • MultiPoint
  • MultiLineString
  • MultiPolygon
  • GeometryCollection
  • TopoGeometry