Skip to content

Independent Smallholders (SPOC)

SPOC Portal

Overview

Property Value
Endpoint GET https://api.mspots.org.my/api/spoc/get-spoc-list-public
Portal URL https://emspo.org.my/public/spoc
Total Records ~306,000
Update Frequency Daily
Authentication None (public)

Parameters

Parameter Type Required Default Description
page integer No 1 Page number (ignored - returns all)
pageSize integer No 10 Records per page (ignored - returns all)

Note: Despite accepting pagination parameters, this endpoint returns ALL records (~306k) in a single response regardless of pageSize.

Response Structure

{
  "data": {
    "page_size": "10",
    "results": [
      {
        "entity_smallholder_id": 73042,
        "entity_name": "JELI/ GUA MUSANG/ KRAI/ MACHANG",
        "mpob_no": "832452001098",
        "planted_area": "2.4300",
        "certified_area": "2.4300",
        "gps_coordinate_lat": "5.551060",
        "gps_coordinate_long": "101.920321",
        "geolocation": "5.551060, 101.920321",
        "name": "ROSNANI BINTI IBRAHIM",
        "stateId": 3,
        "state": "Kelantan",
        "daerahId": null,
        "daerah": null,
        "mukimId": null,
        "mukim": null,
        "lot_no": "14165",
        "cert_status": "DELAYED",
        "zoneId": 11,
        "zone": "Zon Timur 2",
        "cert_no": "MSPO-492/24",
        "spoc_code": "D1",
        "created_at": "2025-06-03T04:46:41.581Z",
        "updated_at": "2024-07-24T01:17:06.101Z",
        "track_id": "OPMC205946",
        "record_status": "NO CHANGE",
        "deleted": false
      }
    ]
  }
}

Response Fields

Field Type Description
entity_smallholder_id integer Unique smallholder plot ID
entity_name string Cluster/entity name (often district list)
mpob_no string MPOB registration number (12 digits)
planted_area string Total planted area in hectares
certified_area string MSPO certified area in hectares
gps_coordinate_lat string GPS latitude
gps_coordinate_long string GPS longitude
geolocation string Combined "lat, long" string
name string Smallholder name
stateId integer Malaysian state ID
state string Malaysian state name
daerahId integer/null District ID
daerah string/null District name
mukimId integer/null Mukim ID
mukim string/null Mukim/subdistrict name
lot_no string Land lot number
cert_status string Certification status
zoneId integer MSPO zone ID (1-11)
zone string MSPO zone name
cert_no string MSPO certificate number
spoc_code string SPOC cluster code (e.g., "A1", "D1")
created_at string Record creation timestamp (ISO 8601)
updated_at string Last update timestamp (ISO 8601)
track_id string Tracking ID (e.g., "OPMC205946")
record_status string Change status: "NO CHANGE" or "UPDATED"
deleted boolean Soft delete flag

Certification Status Values

Status Description
ACTIVE Currently certified
DELAYED Certification delayed/pending
EXPIRED Certification expired
SUSPENDED Certification suspended

MSPO Zones

Zone ID Zone Name States Covered
1 Zon Utara 1 Perlis, Kedah (partial), Perak (partial)
2 Zon Utara 2 Penang, Kedah (partial), Perak (partial)
3 Zon Tengah 1 Selangor, Kuala Lumpur
4 Zon Tengah 2 Negeri Sembilan, Melaka
5 Zon Selatan 1 Johor (partial)
6 Zon Selatan 2 Johor (partial)
7 Zon Timur 1 Pahang
8 Zon Timur 2 Terengganu, Kelantan
9 Zon Sabah 1 Sabah (partial)
10 Zon Sabah 2 Sabah (partial)
11 Zon Sarawak Sarawak

Raw Response → Tidy Data

Raw JSON (nested)

data.results[] → array of smallholder objects

Tidy Parquet Schema

+-------------------------+----------+-------------------------------------------+
| Column                  | Type     | Notes                                     |
+-------------------------+----------+-------------------------------------------+
| entity_smallholder_id   | Int64    | Primary key                               |
| entity_name             | String   |                                           |
| mpob_no                 | String   | 12-digit MPOB number                      |
| name                    | String   | Smallholder name                          |
| planted_area            | Float64  | Cast from string                          |
| certified_area          | Float64  | Cast from string                          |
| gps_coordinate_lat      | Float64  | Cast from string                          |
| gps_coordinate_long     | Float64  | Cast from string                          |
| state_id                | Int32    | Renamed from stateId                      |
| state                   | String   |                                           |
| daerah_id               | Int32    | Renamed from daerahId (nullable)          |
| daerah                  | String   | (nullable)                                |
| mukim_id                | Int32    | Renamed from mukimId (nullable)           |
| mukim                   | String   | (nullable)                                |
| lot_no                  | String   |                                           |
| zone_id                 | Int32    | Renamed from zoneId                       |
| zone                    | String   |                                           |
| spoc_code               | String   |                                           |
| cert_no                 | String   |                                           |
| cert_status             | String   |                                           |
| track_id                | String   |                                           |
| record_status           | String   |                                           |
| deleted                 | Boolean  |                                           |
| created_at              | Datetime | Parsed from ISO 8601                      |
| updated_at              | Datetime | Parsed from ISO 8601                      |
+-------------------------+----------+-------------------------------------------+

GCS Storage

gs://calee_data/raw/mspo/api/smallholders/spoc/list-public/
└── YYYYMMDD/
    └── list-public.parquet    # ~30MB compressed (zstd)

Example Code

Python (requests + polars)

import requests
import polars as pl
from datetime import datetime

def fetch_spoc_public():
    """Fetch all SPOC smallholders from public endpoint."""
    url = "https://api.mspots.org.my/api/spoc/get-spoc-list-public"

    # Note: pageSize is ignored, returns all ~306k records
    response = requests.get(url, params={"page": 1, "pageSize": 10}, verify=False)
    response.raise_for_status()

    data = response.json()["data"]["results"]
    print(f"Fetched {len(data):,} records")

    return data

def to_dataframe(data: list) -> pl.DataFrame:
    """Convert JSON to typed Polars DataFrame."""
    df = pl.DataFrame(data)

    # Cast numeric columns
    df = df.with_columns([
        pl.col("planted_area").cast(pl.Float64),
        pl.col("certified_area").cast(pl.Float64),
        pl.col("gps_coordinate_lat").cast(pl.Float64),
        pl.col("gps_coordinate_long").cast(pl.Float64),
    ])

    # Rename camelCase to snake_case
    df = df.rename({
        "stateId": "state_id",
        "daerahId": "daerah_id",
        "mukimId": "mukim_id",
        "zoneId": "zone_id",
    })

    return df

def save_to_gcs(df: pl.DataFrame):
    """Save DataFrame to GCS as Parquet."""
    from google.cloud import storage
    import io

    date_str = datetime.now().strftime("%Y%m%d")
    path = f"raw/mspo/api/spoc/get-spoc-list-public/{date_str}/spoc-list-public.parquet"

    buffer = io.BytesIO()
    df.write_parquet(buffer, compression="zstd")

    client = storage.Client()
    bucket = client.bucket("calee_data")
    blob = bucket.blob(path)
    blob.upload_from_file(io.BytesIO(buffer.getvalue()), content_type="application/octet-stream")

    print(f"Saved to gs://calee_data/{path}")
    return path

# Usage
data = fetch_spoc_public()
df = to_dataframe(data)
save_to_gcs(df)

cURL

# Fetch all SPOC smallholders (returns ~80MB JSON)
curl -sk "https://api.mspots.org.my/api/spoc/get-spoc-list-public?page=1&pageSize=10" \
  -H "Accept: application/json" \
  -o spoc-list-public.json

# Check record count
jq '.data.results | length' spoc-list-public.json

DuckDB Query

-- Query from GCS
SELECT
    state,
    COUNT(*) as smallholders,
    SUM(certified_area) as total_ha,
    AVG(certified_area) as avg_ha
FROM read_parquet('gs://calee_data/raw/mspo/api/spoc/get-spoc-list-public/*/*.parquet')
WHERE cert_status = 'ACTIVE'
  AND deleted = false
GROUP BY state
ORDER BY total_ha DESC;

Data Quality Notes

  1. GPS Coordinates: Some records have null or invalid coordinates
  2. District/Mukim: Many records have null daerahId/mukimId
  3. Duplicate MPOB Numbers: A single MPOB number can have multiple plots (different lot_no)
  4. Area Precision: Areas are stored as strings with 4 decimal places
  5. Deleted Flag: Filter on deleted = false for active records

Comparison: Public vs Internal Endpoint

Aspect Public (get-spoc-list-public) Internal (get-spoc-list)
Authentication None Session required
Pagination Returns all at once (~306k) Paginated (1000/page)
Response Size ~80MB JSON ~3MB per page
Extra Fields record_status, deleted None
Use Case Bulk export, analytics Real-time queries