Independent Smallholders (SPOC)

Overview
| Property |
Value |
| Endpoint |
GET https://api.mspots.org.my/api/spoc/get-spoc-list-public |
| Portal URL |
https://emspo.org.my/public/spoc |
| Total Records |
~306,000 |
| Update Frequency |
Daily |
| Authentication |
None (public) |
Parameters
| Parameter |
Type |
Required |
Default |
Description |
page |
integer |
No |
1 |
Page number (ignored - returns all) |
pageSize |
integer |
No |
10 |
Records per page (ignored - returns all) |
Note: Despite accepting pagination parameters, this endpoint returns ALL records (~306k) in a single response regardless of pageSize.
Response Structure
{
"data": {
"page_size": "10",
"results": [
{
"entity_smallholder_id": 73042,
"entity_name": "JELI/ GUA MUSANG/ KRAI/ MACHANG",
"mpob_no": "832452001098",
"planted_area": "2.4300",
"certified_area": "2.4300",
"gps_coordinate_lat": "5.551060",
"gps_coordinate_long": "101.920321",
"geolocation": "5.551060, 101.920321",
"name": "ROSNANI BINTI IBRAHIM",
"stateId": 3,
"state": "Kelantan",
"daerahId": null,
"daerah": null,
"mukimId": null,
"mukim": null,
"lot_no": "14165",
"cert_status": "DELAYED",
"zoneId": 11,
"zone": "Zon Timur 2",
"cert_no": "MSPO-492/24",
"spoc_code": "D1",
"created_at": "2025-06-03T04:46:41.581Z",
"updated_at": "2024-07-24T01:17:06.101Z",
"track_id": "OPMC205946",
"record_status": "NO CHANGE",
"deleted": false
}
]
}
}
Response Fields
| Field |
Type |
Description |
entity_smallholder_id |
integer |
Unique smallholder plot ID |
entity_name |
string |
Cluster/entity name (often district list) |
mpob_no |
string |
MPOB registration number (12 digits) |
planted_area |
string |
Total planted area in hectares |
certified_area |
string |
MSPO certified area in hectares |
gps_coordinate_lat |
string |
GPS latitude |
gps_coordinate_long |
string |
GPS longitude |
geolocation |
string |
Combined "lat, long" string |
name |
string |
Smallholder name |
stateId |
integer |
Malaysian state ID |
state |
string |
Malaysian state name |
daerahId |
integer/null |
District ID |
daerah |
string/null |
District name |
mukimId |
integer/null |
Mukim ID |
mukim |
string/null |
Mukim/subdistrict name |
lot_no |
string |
Land lot number |
cert_status |
string |
Certification status |
zoneId |
integer |
MSPO zone ID (1-11) |
zone |
string |
MSPO zone name |
cert_no |
string |
MSPO certificate number |
spoc_code |
string |
SPOC cluster code (e.g., "A1", "D1") |
created_at |
string |
Record creation timestamp (ISO 8601) |
updated_at |
string |
Last update timestamp (ISO 8601) |
track_id |
string |
Tracking ID (e.g., "OPMC205946") |
record_status |
string |
Change status: "NO CHANGE" or "UPDATED" |
deleted |
boolean |
Soft delete flag |
Certification Status Values
| Status |
Description |
ACTIVE |
Currently certified |
DELAYED |
Certification delayed/pending |
EXPIRED |
Certification expired |
SUSPENDED |
Certification suspended |
MSPO Zones
| Zone ID |
Zone Name |
States Covered |
| 1 |
Zon Utara 1 |
Perlis, Kedah (partial), Perak (partial) |
| 2 |
Zon Utara 2 |
Penang, Kedah (partial), Perak (partial) |
| 3 |
Zon Tengah 1 |
Selangor, Kuala Lumpur |
| 4 |
Zon Tengah 2 |
Negeri Sembilan, Melaka |
| 5 |
Zon Selatan 1 |
Johor (partial) |
| 6 |
Zon Selatan 2 |
Johor (partial) |
| 7 |
Zon Timur 1 |
Pahang |
| 8 |
Zon Timur 2 |
Terengganu, Kelantan |
| 9 |
Zon Sabah 1 |
Sabah (partial) |
| 10 |
Zon Sabah 2 |
Sabah (partial) |
| 11 |
Zon Sarawak |
Sarawak |
Raw Response → Tidy Data
Raw JSON (nested)
data.results[] → array of smallholder objects
Tidy Parquet Schema
+-------------------------+----------+-------------------------------------------+
| Column | Type | Notes |
+-------------------------+----------+-------------------------------------------+
| entity_smallholder_id | Int64 | Primary key |
| entity_name | String | |
| mpob_no | String | 12-digit MPOB number |
| name | String | Smallholder name |
| planted_area | Float64 | Cast from string |
| certified_area | Float64 | Cast from string |
| gps_coordinate_lat | Float64 | Cast from string |
| gps_coordinate_long | Float64 | Cast from string |
| state_id | Int32 | Renamed from stateId |
| state | String | |
| daerah_id | Int32 | Renamed from daerahId (nullable) |
| daerah | String | (nullable) |
| mukim_id | Int32 | Renamed from mukimId (nullable) |
| mukim | String | (nullable) |
| lot_no | String | |
| zone_id | Int32 | Renamed from zoneId |
| zone | String | |
| spoc_code | String | |
| cert_no | String | |
| cert_status | String | |
| track_id | String | |
| record_status | String | |
| deleted | Boolean | |
| created_at | Datetime | Parsed from ISO 8601 |
| updated_at | Datetime | Parsed from ISO 8601 |
+-------------------------+----------+-------------------------------------------+
GCS Storage
gs://calee_data/raw/mspo/api/smallholders/spoc/list-public/
└── YYYYMMDD/
└── list-public.parquet # ~30MB compressed (zstd)
Example Code
Python (requests + polars)
import requests
import polars as pl
from datetime import datetime
def fetch_spoc_public():
"""Fetch all SPOC smallholders from public endpoint."""
url = "https://api.mspots.org.my/api/spoc/get-spoc-list-public"
# Note: pageSize is ignored, returns all ~306k records
response = requests.get(url, params={"page": 1, "pageSize": 10}, verify=False)
response.raise_for_status()
data = response.json()["data"]["results"]
print(f"Fetched {len(data):,} records")
return data
def to_dataframe(data: list) -> pl.DataFrame:
"""Convert JSON to typed Polars DataFrame."""
df = pl.DataFrame(data)
# Cast numeric columns
df = df.with_columns([
pl.col("planted_area").cast(pl.Float64),
pl.col("certified_area").cast(pl.Float64),
pl.col("gps_coordinate_lat").cast(pl.Float64),
pl.col("gps_coordinate_long").cast(pl.Float64),
])
# Rename camelCase to snake_case
df = df.rename({
"stateId": "state_id",
"daerahId": "daerah_id",
"mukimId": "mukim_id",
"zoneId": "zone_id",
})
return df
def save_to_gcs(df: pl.DataFrame):
"""Save DataFrame to GCS as Parquet."""
from google.cloud import storage
import io
date_str = datetime.now().strftime("%Y%m%d")
path = f"raw/mspo/api/spoc/get-spoc-list-public/{date_str}/spoc-list-public.parquet"
buffer = io.BytesIO()
df.write_parquet(buffer, compression="zstd")
client = storage.Client()
bucket = client.bucket("calee_data")
blob = bucket.blob(path)
blob.upload_from_file(io.BytesIO(buffer.getvalue()), content_type="application/octet-stream")
print(f"Saved to gs://calee_data/{path}")
return path
# Usage
data = fetch_spoc_public()
df = to_dataframe(data)
save_to_gcs(df)
cURL
# Fetch all SPOC smallholders (returns ~80MB JSON)
curl -sk "https://api.mspots.org.my/api/spoc/get-spoc-list-public?page=1&pageSize=10" \
-H "Accept: application/json" \
-o spoc-list-public.json
# Check record count
jq '.data.results | length' spoc-list-public.json
DuckDB Query
-- Query from GCS
SELECT
state,
COUNT(*) as smallholders,
SUM(certified_area) as total_ha,
AVG(certified_area) as avg_ha
FROM read_parquet('gs://calee_data/raw/mspo/api/spoc/get-spoc-list-public/*/*.parquet')
WHERE cert_status = 'ACTIVE'
AND deleted = false
GROUP BY state
ORDER BY total_ha DESC;
Data Quality Notes
- GPS Coordinates: Some records have null or invalid coordinates
- District/Mukim: Many records have null
daerahId/mukimId
- Duplicate MPOB Numbers: A single MPOB number can have multiple plots (different lot_no)
- Area Precision: Areas are stored as strings with 4 decimal places
- Deleted Flag: Filter on
deleted = false for active records
Comparison: Public vs Internal Endpoint
| Aspect |
Public (get-spoc-list-public) |
Internal (get-spoc-list) |
| Authentication |
None |
Session required |
| Pagination |
Returns all at once (~306k) |
Paginated (1000/page) |
| Response Size |
~80MB JSON |
~3MB per page |
| Extra Fields |
record_status, deleted |
None |
| Use Case |
Bulk export, analytics |
Real-time queries |