Optimizing the Crude Oil Supply Chain in Canada: A Mathematical and Data-Driven Approach 

The Canadian crude oil supply chain faces increasingly complex constraints. Pipeline congestion, apportionment, rail bottlenecks, storage limitations, and fluctuating price differentials collectively erode netback profitability. In this article, we present a mathematically formulated, data-integrated approach for optimizing the upstream-to-downstream movement of crude oil, enabling producers to make data-informed, profit-maximizing logistical and commercial decisions.

Want to explore how this model can be tailored to your operations?

Schedule a free consultation.

Background

Most Canadian crude oil is landlocked with limited access to global markets. While pipelines remain the dominant mode of transportation, they frequently operate at or near capacity, leading to apportionment and substantial volatility in realized netbacks. Rail offers a more flexible alternative, but at a higher cost and with operational challenges such as scheduling constraints and turnaround times. In this complex environment, producers must continually manage contractual nominations, storage utilization, and pricing differentials in real time.
A centralized optimization model can be instrumental in effectively addressing these challenges. By integrating data from operational systems, market signals, and infrastructure availability, the model provides prescriptive recommendations that guide tactical actions and strategic planning. The key decision areas include routing, batching, storage management, and nomination scheduling. The primary objectives of the optimization framework are as follows:
  • Maximizing Netback Revenue: Select the most economically advantageous transportation and delivery routes. 
  • Dynamic Volume Allocation: Adjust crude volumes across pipelines, rail, and storage dynamically, based on capacity limits, price spreads, timing considerations, and operational constraints. 
  • Operational Constraint Modeling: Incorporate real-world logistics such as apportionment rules, batch sequencing, and blending limitations into the decision logic.  

Mathematical Framework 

This supply chain optimization problem can be formulated as a multiperiod mixed-integer linear program (MILP). In this case, the core objective is to maximize netback revenue across all upstream production sites i, destination markets j, and time periods t. Netback is defined as the realized price at each destination minus the associated costs, including transportation, production, storage, and pricing differentials:

Netback_{i,j,t} = P_{j,t} − c_trans_{i,j,t} − c_prod_{i,t} − c_stor_{i,t} − δ_{i,j,t}

Where:

  • P_{j,t} — realized price at market j in period t
  • c_trans_{i,j,t} — transportation cost from site i to market j in period t
  • c_prod_{i,t} — production cost at site i in period t
  • c_stor_{i,t} — storage cost at site i in period t
  • δ_{i,j,t} — applicable quality and location pricing differential

Given that the primary goal is to maximize netback revenue for producers, the objective function can be formulated as follows:

max Σ_{i,j,t} Netback_{i,j,t} · x_{i,j,t}

Where:

  • x_{i,j,t} — volume of crude moved from site i to market j in period t

Subject to:

1. Transportation capacity constraints:

Σ_i x_{i,j,t} ≤ Cap_{j,t} for every market j and period t

Where:

  • Cap_{j,t} — available pipeline or rail capacity into market j in period t

2. Apportionment rules:

x_{i,j,t} ≤ (1 − a_{j,t}) · Nom_{i,j,t} on apportioned pipeline routes

Where:

  • a_{j,t} — apportionment fraction applied on the route to market j in period t
  • Nom_{i,j,t} — volume nominated by site i for that route in period t

3. Storage balance:

S_{i,t} = S_{i,t−1} + In_{i,t} − Out_{i,t}, with 0 ≤ S_{i,t} ≤ S_max_i

Where:

  • S_{i,t} — inventory held at site i at the end of period t
  • In_{i,t}, Out_{i,t} — injection and withdrawal volumes at site i in period t
  • S_max_i — storage capacity at site i

4. Batch and blending rules (binary constraints where necessary):

Σ_k x_{i,j,t,k} ≤ 1 when batch types on a route are mutually exclusive

Where:

  • x_{i,j,t,k} — selection of batch type k on the route from site i to market j in period t
The binary restriction x_{i,j,t,k} ∈ {0, 1} is applied when only one batch type can be selected or dispatched per route in each period, such as when operational rules require discrete batching decisions. For scenarios involving continuous flow volumes, the variable can be relaxed to x_{i,j,t,k} ∈ [0, U], where U is the upper bound defined by capacity or contractual limits. Logical batching constraints, such as mutual exclusivity or minimum run sizes, can then be handled separately within the model.
Additional operational, regulatory, or economic factors not explicitly represented here (e.g., quality specifications, downstream blending compatibility, or emission constraints) can also be incorporated into the optimization framework as required. 
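
For readers who want to see the formulation in code, below is a minimal Pyomo sketch of the objective and two of the constraint families above. All set members, parameter values, and bounds are illustrative placeholders, not production data.

```python
# Minimal Pyomo sketch of the MILP above. Set members and parameter
# values are illustrative placeholders, not production data.
import pyomo.environ as pyo

m = pyo.ConcreteModel()

# Sets: production sites i, destination markets j, time periods t
m.I = pyo.Set(initialize=["site_A", "site_B"])
m.J = pyo.Set(initialize=["Hardisty", "Cushing"])
m.T = pyo.Set(initialize=[1, 2, 3])

# Parameters (placeholder values; netback in CAD/bbl, volumes in bbl)
m.netback = pyo.Param(m.I, m.J, m.T, initialize=lambda m, i, j, t: 25.0, mutable=True)
m.cap = pyo.Param(m.J, m.T, initialize=lambda m, j, t: 50_000.0)      # route capacity
m.supply = pyo.Param(m.I, m.T, initialize=lambda m, i, t: 40_000.0)   # forecast output

# Decision variable: volume moved from site i to market j in period t
m.x = pyo.Var(m.I, m.J, m.T, domain=pyo.NonNegativeReals)

# Objective: maximize total netback revenue
m.obj = pyo.Objective(
    expr=sum(m.netback[i, j, t] * m.x[i, j, t] for i in m.I for j in m.J for t in m.T),
    sense=pyo.maximize,
)

# Transportation capacity: total flow into each market is bounded
def cap_rule(m, j, t):
    return sum(m.x[i, j, t] for i in m.I) <= m.cap[j, t]
m.cap_con = pyo.Constraint(m.J, m.T, rule=cap_rule)

# Supply limit: shipments from a site cannot exceed forecast production
def supply_rule(m, i, t):
    return sum(m.x[i, j, t] for j in m.J) <= m.supply[i, t]
m.supply_con = pyo.Constraint(m.I, m.T, rule=supply_rule)
```

Storage balance, apportionment, and binary batching constraints attach to the same model in exactly the same pattern.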

Need help modeling your supply chain mathematically?

Schedule a free consultation.

Key Optimization Features

The proposed optimization framework can serve as a comprehensive decision support system that incorporates real-world market signals, infrastructure constraints, and operational rules. It is purpose-built to reflect the economic trade-offs and logistical challenges unique to Canada’s crude oil producers. The core intelligence of the model is reflected in the following key capabilities.

  • Transportation Mode Optimization: Dynamically selects between pipeline and rail based on tariffs, WCS-WTI price differentials, and real-time infrastructure availability to maximize netback value (see the sketch following this list). 
  • Storage Strategy Optimization: Leverages injection and withdrawal scheduling to buffer volumes during unfavorable market conditions such as negative price spreads, pipeline outages, or railcar bottlenecks. 
  • Apportionment and Equity Handling: Models nomination acceptance, historical rejection patterns, and equitable volume allocation using rule-based logic informed by verified apportionment data. 
  • Scenario-Based Simulation: Enables “what-if” analysis to evaluate the effects of infrastructure changes (e.g., pipeline reversals), diluent strategies, regulatory shifts, or carbon pricing policies. 
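
To make the pipeline-versus-rail trade-off concrete, here is a hedged, self-contained sketch of the netback comparison the first capability performs; all prices and costs are illustrative placeholders.

```python
# Illustrative per-barrel netback comparison between pipeline and rail.
# All prices and costs below are placeholder numbers, not market data.
def netback(realized_price: float, transport_cost: float) -> float:
    """Netback = realized price at destination minus transport cost."""
    return realized_price - transport_cost

wti = 78.00          # benchmark price, USD/bbl (placeholder)
wcs_diff = 15.00     # WCS-WTI differential, USD/bbl (placeholder)
realized = wti - wcs_diff

pipeline_nb = netback(realized, transport_cost=8.50)   # pipeline toll
rail_nb = netback(realized, transport_cost=14.00)      # rail tariff

# Rail wins only when pipeline space is apportioned away or the
# differential widens enough to absorb the higher tariff.
best = "pipeline" if pipeline_nb >= rail_nb else "rail"
print(f"pipeline: {pipeline_nb:.2f}, rail: {rail_nb:.2f} -> choose {best}")
```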

Data Engineering and Architecture

To ensure robust optimization performance, a Databricks Lakehouse architecture can be implemented with Delta Lake storage. Built on the Medallion Architecture, this framework organizes data into Bronze (raw), Silver (cleaned), and Gold (curated) layers to ensure traceability, scalability, and high-quality decision-making. Databricks pipelines are designed for both real-time streaming and batch processing, depending on the nature and urgency of the input.
Data quality is embedded throughout the pipeline. Built-in validation rules, freshness checks, and schema enforcement help maintain reliability. Delta Lake’s versioning, lineage, and ACID properties support auditing and rollback. Together, these controls ensure that the optimization decisions are based on clean, current, and trustworthy data.

Bronze Layer – Raw Ingestion

The Bronze layer ingests data in its raw, unprocessed form, preserving fidelity and enabling traceability. Real-time data, such as market prices and SCADA telemetry, are captured via streaming APIs or IoT streams, whereas nomination data, rail logs, and regulatory updates arrive through scheduled batch feeds.
  • Real-time: WCS/WTI prices, pipeline sensor data, railcar tracking 
  • Batch: Nominations, apportionment records, storage reports, government policy data
This layer prioritizes ingestion completeness and metadata tagging over structure or quality.
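
As a concrete illustration, here is a minimal PySpark sketch of Bronze-layer ingestion on Databricks. The paths, Kafka topic, and column names are assumptions made for the example.

```python
# Bronze-layer ingestion sketch (PySpark on Databricks).
# `spark` is the ambient SparkSession in a Databricks notebook.
# Paths, topic names, and columns are assumptions for illustration.
from pyspark.sql import functions as F

# Batch feed: nomination files landed in cloud storage
nominations_raw = (
    spark.read.format("csv")
    .option("header", "true")
    .load("/mnt/landing/nominations/")
    .withColumn("_ingested_at", F.current_timestamp())   # metadata tagging
    .withColumn("_source_file", F.input_file_name())
)
nominations_raw.write.format("delta").mode("append").saveAsTable("bronze.nominations")

# Streaming feed: market prices arriving on an assumed Kafka topic
prices_stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "wcs_wti_prices")
    .load()
    .withColumn("_ingested_at", F.current_timestamp())
)
(prices_stream.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/chk/prices")
    .toTable("bronze.prices"))
```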

Silver Layer – Cleansing and Standardization

In the Silver layer, the data is validated, standardized, and aligned across sources. This includes unit conversion, timestamp harmonization, anomaly detection, and cross-system joins. Quality checks, such as outlier filtering and referential integrity, are applied to ensure consistency before downstream modeling.
  • Aligns pipeline and rail volumes.
  • Validates price and production data.
  • Enforces business rules and data integrity.
  • Uses a mix of batch and microbatch updates.
The Silver layer thus serves as the foundation of accurate inputs for optimization.
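
The sketch below shows what these Silver-layer steps can look like in PySpark; the table and column names are assumptions for illustration.

```python
# Silver-layer sketch: harmonize timestamps, convert units, filter
# outliers, and enforce referential integrity. Names are assumptions.
# `spark` is the ambient SparkSession in a Databricks notebook.
from pyspark.sql import functions as F

bronze = spark.read.table("bronze.pipeline_volumes")

silver = (
    bronze
    # Timestamp harmonization: normalize local readings to UTC
    .withColumn("event_ts", F.to_utc_timestamp("event_ts", "America/Edmonton"))
    # Unit conversion: cubic metres to barrels (1 m3 ≈ 6.2898 bbl)
    .withColumn("volume_bbl", F.col("volume_m3") * 6.2898)
    # Outlier filtering: drop non-physical readings
    .filter((F.col("volume_bbl") >= 0) & (F.col("volume_bbl") < 2_000_000))
    # Referential integrity: keep only rows that match a known site
    .join(spark.read.table("silver.sites"), on="site_id", how="inner")
    .dropDuplicates(["site_id", "event_ts"])
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver.pipeline_volumes")
```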

Gold Layer – Curated and Trusted Outputs

The Gold layer delivers structured, business-ready datasets that are fed directly into optimization models, dashboards, and simulations. These include netback calculations, capacity-adjusted transport options, and storage strategies for various scenarios.
  • Curated netback metrics by site and market 
  • Optimized routing and batching profiles 
  • Scenario data for infrastructure, pricing, or policy shifts
The data here are fully validated, refreshed in real time or in batch, and easily consumed across business teams through Databricks notebooks and Power BI dashboards. 
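
Below is a short sketch of how a curated Gold table such as netback-by-site-and-market can be built; the Silver tables and columns it references are assumptions.

```python
# Gold-layer sketch: curated netback metrics by site and market.
# The Silver tables and columns referenced here are assumptions.
# `spark` is the ambient SparkSession in a Databricks notebook.
from pyspark.sql import functions as F

volumes = spark.read.table("silver.pipeline_volumes")
prices = spark.read.table("silver.market_prices")      # realized price per market/day
costs = spark.read.table("silver.transport_costs")     # cost per site-market pair

gold = (
    volumes.join(prices, ["market", "trade_date"])
           .join(costs, ["site_id", "market"])
    .withColumn("netback_bbl", F.col("realized_price") - F.col("transport_cost"))
    .groupBy("site_id", "market", "trade_date")
    .agg(
        F.sum("volume_bbl").alias("volume_bbl"),
        F.avg("netback_bbl").alias("avg_netback_bbl"),
    )
)
gold.write.format("delta").mode("overwrite").saveAsTable("gold.netback_by_site_market")
```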

Curious how your infrastructure supports optimization?

Schedule a free consultation.

Required Data Inputs

The following data inputs are required to enable actionable optimization; a sketch of how they can be packaged for the model appears after the list.
  • Production Forecasts: Daily/monthly output by crude type (e.g., diluted bitumen, synthetic crude).
  • Pipeline Data: Capacity, tolls, nomination windows, apportionment history, and batching policies.
  • Rail Data: Tariffs, car availability, turnaround times, and origin–destination pairings.
  • Storage Data: Available capacity, current fill levels, injection/withdrawal limits.
  • Pricing Data: WCS, WTI, Brent benchmarks, price differentials, tariffs, and netback values.
  • Refinery Demand: Contract volumes, quality specifications, and planned outages.
  • Verification and Nomination Data: Historical nomination behavior, verification procedures, and apportionment results.
  • Market and Regulatory Data: Government curtailments, pipeline approval timelines, and royalty frameworks.
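
One lightweight way to hand these inputs to the optimizer is a typed container like the sketch below; the field names and granularity are assumptions made for illustration.

```python
# Typed container for the model inputs listed above. Field names and
# granularity are assumptions; real schemas would be project-specific.
from dataclasses import dataclass, field

@dataclass
class OptimizationInputs:
    production_forecast: dict   # (site, period) -> bbl/day by crude type
    pipeline_data: dict         # route -> {"capacity": bbl/day, "toll": CAD/bbl}
    rail_data: dict             # (origin, destination) -> {"tariff": CAD/bbl, "cars": int}
    storage_data: dict          # site -> {"capacity": bbl, "fill": bbl}
    pricing_data: dict          # (benchmark, period) -> USD/bbl
    refinery_demand: dict       # (refinery, period) -> bbl/day
    apportionment_history: dict = field(default_factory=dict)  # (route, period) -> fraction
```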

From Forecast to Action: Real-Time Optimization & Scenarios

To implement the optimization engine, we begin by building a mixed-integer linear programming (MILP) model in Python using Pyomo or PuLP. It can be configured to interface with Gurobi for high-performance solving, with optional fallback to open-source solvers such as the COIN-OR Branch and Cut (CBC) for cost-sensitive environments. Machine learning forecasts, developed in scikit-learn and trained on historical data, can be integrated to account for uncertainty in factors such as WCS-WTI spreads and refinery demand.
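
The snippet below sketches the solver fallback just described, using Pyomo’s standard SolverFactory interface; it assumes a model object `m` like the one in the earlier formulation sketch.

```python
# Solver selection with a Gurobi-to-CBC fallback, as described above.
# Assumes a Pyomo model `m` such as the one sketched earlier.
import pyomo.environ as pyo

def solve_with_fallback(model):
    """Prefer Gurobi when licensed; otherwise fall back to open-source CBC."""
    for name in ("gurobi", "cbc"):
        solver = pyo.SolverFactory(name)
        if solver.available(exception_flag=False):
            return solver.solve(model, tee=False)
    raise RuntimeError("No MILP solver available")

result = solve_with_fallback(m)
```
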
Next, we establish a dynamic re-optimization workflow that ingests real-time inputs such as apportionment levels, pricing differentials, and refinery offtake forecasts. These inputs trigger automated model runs that update the recommendations for the transport mode, storage allocation, and nomination schedules. Optimizations can be scheduled (hourly or daily) or triggered on demand using a callable API.
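
As one possible shape for the callable API, here is a hedged FastAPI sketch; the endpoint path, payload fields, and the `run_optimization` wrapper are assumptions, not an existing interface.

```python
# Hedged sketch of an on-demand re-optimization endpoint (FastAPI).
# The payload fields and `run_optimization` wrapper are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class MarketUpdate(BaseModel):
    apportionment_pct: float   # latest pipeline apportionment, e.g. 0.35
    wcs_wti_spread: float      # USD/bbl
    refinery_offtake: float    # bbl/day

def run_optimization(apportionment: float, spread: float, offtake: float) -> dict:
    """Stub for the MILP wrapper described above (assumed interface)."""
    return {"mode": "rail", "volume_bbl": 35_000}  # placeholder result

@app.post("/optimize")
def optimize(update: MarketUpdate):
    # Refresh model parameters from the latest inputs and re-solve
    results = run_optimization(
        apportionment=update.apportionment_pct,
        spread=update.wcs_wti_spread,
        offtake=update.refinery_offtake,
    )
    return {"recommendations": results}
```
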
To support scenario analysis, we enable parameterized inputs that reflect infrastructure changes, regulatory shifts, or new blending strategies. Scenarios can be triggered via an API or scheduled workflows, with outputs, including dispatch volumes, netback forecasts, and contingency routing, sent to dashboards or downstream systems. 
For long-term adaptability, we incorporate a reinforcement learning (RL) module that uses past apportionment and netback outcomes to refine nomination strategies; a toy illustration follows. The module begins with a pilot in which the RL agent is trained on feedback from historical data, and strategy accuracy gradually improves as market conditions evolve. 
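
To ground the RL idea, the sketch below uses an epsilon-greedy bandit that nudges nomination strategy toward whatever has historically produced the best realized netback; the strategy names and update rule are purely illustrative.

```python
# Toy epsilon-greedy bandit for nomination strategy selection.
# Strategy names and the reward signal (realized netback) are illustrative.
import random

strategies = ["nominate_full", "nominate_buffered", "shift_to_rail"]
value = {s: 0.0 for s in strategies}   # running mean netback per strategy
count = {s: 0 for s in strategies}
EPSILON = 0.1                          # exploration rate

def choose_strategy() -> str:
    """Explore with probability EPSILON, otherwise exploit the best estimate."""
    if random.random() < EPSILON:
        return random.choice(strategies)
    return max(strategies, key=value.get)

def record_outcome(strategy: str, realized_netback: float) -> None:
    """Incremental mean update from observed netback feedback."""
    count[strategy] += 1
    value[strategy] += (realized_netback - value[strategy]) / count[strategy]
```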

Turning Models into Decisions: Dashboards & Deployment

To ensure seamless interaction across technical and commercial teams, a deployment architecture can be built to emphasize modularity, scalability, and role-based access. The core optimization engine is exposed via RESTful APIs that are connected to a version-controlled data backend, such as PostgreSQL or Snowflake, thereby enabling reliable input/output management.
On the front end, interactive dashboards (developed using tools such as Dash, Streamlit, or Power BI) provide planners and analysts with real-time access to optimization outcomes. These dashboards allow users to explore scenario results, interpret model recommendations, and make data-informed decisions. The key decision support metrics to be tracked are listed below, followed by a brief dashboard sketch:
  • Netback revenue by market and production site 
  • Recommended route allocations 
  • Volume forecasts adjusted for apportionment 
  • Optimal transport mode by location and period 
  • Comparative netback improvements over baseline strategies 
  • Storage fill and withdrawal trajectories 
  • Pipeline and rail utilization heatmaps 
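
Below is a minimal Streamlit sketch of such a dashboard view; the exported file name and its columns are assumptions carried over from the Gold-layer example.

```python
# Minimal Streamlit sketch surfacing the metrics above. The CSV name
# and columns are assumptions carried over from the Gold-layer example.
import pandas as pd
import streamlit as st

st.title("Crude Supply Chain Optimizer")

netbacks = pd.read_csv("gold_netback_by_site_market.csv")   # assumed export

site = st.selectbox("Production site", sorted(netbacks["site_id"].unique()))
view = netbacks[netbacks["site_id"] == site]

st.metric("Average netback (CAD/bbl)", f"{view['avg_netback_bbl'].mean():.2f}")
st.bar_chart(view, x="market", y="avg_netback_bbl")
st.dataframe(view.sort_values("trade_date", ascending=False))
```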

Operational Deployment Architecture

For enterprise-level visibility, reporting outputs can also be integrated with Power BI or Tableau to consolidate performance tracking and strategic insights across teams. For users such as dispatch managers and commercial planners, the system can be built to support operational overrides (e.g., enforcing rail transport owing to maintenance) and the ability to initiate ad hoc optimization runs via API or dashboard triggers. 
All model runs are logged and versioned using MLflow wherever possible to ensure full auditability and reproducibility. Final outputs, such as daily nomination schedules, storage forecasts, and strategy comparisons, can be exported as follows (a brief sketch appears after the list): 
  • Structured CSV files for internal reporting 
  • API payloads for direct integration with ERP systems and downstream applications
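
The sketch below shows one way to wire the run logging and both export paths together; the run name, metric value, and ERP endpoint are assumptions.

```python
# Hedged sketch of MLflow run logging plus the two export paths above.
# The run name, metric value, and ERP endpoint are assumptions.
import mlflow
import pandas as pd
import requests

plan = pd.DataFrame(
    {"site": ["site_A"], "market": ["Cushing"], "volume_bbl": [38_500]}
)  # placeholder nomination schedule

with mlflow.start_run(run_name="daily_nomination_plan"):
    mlflow.log_param("solver", "cbc")
    mlflow.log_metric("total_netback_cad", 1_245_000.0)   # placeholder value
    plan.to_csv("nomination_schedule.csv", index=False)   # structured CSV export
    mlflow.log_artifact("nomination_schedule.csv")        # versioned with the run

# API payload for ERP/downstream integration (hypothetical endpoint)
requests.post(
    "https://erp.example.com/api/nominations",
    json=plan.to_dict(orient="records"),
    timeout=30,
)
```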

Why It Matters: Economic, Strategic, and Operational Wins

The crude oil supply chain optimization system can deliver measurable improvements in profitability, efficiency, and risk management. Producers can improve their netback revenue by dynamically aligning transportation, storage, and marketing strategies with real-time market conditions and infrastructure constraints. The system enables smarter use of blending, rail arbitrage, and storage timing to manage apportionment risk, avoid over-nomination penalties, and optimize delivery economics.
In addition to short-term gains, the platform can support long-term strategic planning, improve transparency, and enhance regulatory readiness. This creates a centralized, version-controlled environment for managing nomination logic, making monthly submissions more consistent and auditable. Key benefits include:
  • Revenue Maximization: Boost realized netbacks by optimizing mode, routing, and timing decisions using real-time pricing and capacity data. 
  • Apportionment Mitigation: Leverage rail and proactive storage to maintain flow continuity during constrained pipeline periods. 
  • Equitable Access Modeling: Simulate nomination fairness to support the inclusion of smaller or non-integrated producers. 
  • Infrastructure and Contract Planning: Evaluate “what-if” scenarios, including pipeline reversals, blending strategies, and expansion options. 
  • Regulatory Readiness: Enable traceable, auditable workflows that align with compliance requirements and internal controls. 
  • Operational Efficiency: Reduce manual effort and planning time by automating constraint-aware scheduling and nominations. 
  • Governance and Reporting Strength: Centralize decision logic with versioning for transparent, governed nomination processes.

Future Roadmap

The optimization framework has strong potential for expansion into more advanced, resilient, and sustainability-aligned decision-making. Future enhancements could increase the system’s predictive power, operational agility, and ESG integration, making the platform more adaptive to uncertainty, more responsive to real-time events, and better aligned with environmental targets. Planned roadmap initiatives include the following.
  • Stochastic Optimization: Extend the MILP model to incorporate uncertain price paths, demand volatility, and infrastructure outages using stochastic or robust programming techniques. 
  • ESG Integration: Add constraints and objectives related to emissions per barrel, carbon intensity by route, or energy consumption to align with corporate sustainability and regulatory commitments. 
  • Digital Twin Deployment: Integrate real-time SCADA and IoT data from terminals to enable live re-optimization during operational hours, creating a digital twin that mirrors the physical crude supply chain. 
  • Real-Time Scenario Triggers: Enable event-based re-optimization (e.g., pipeline incident or sudden price swing) for more agile and autonomous response mechanisms. 
  • Advanced Forecasting Modules: Incorporate deep learning or hybrid ML models to improve the accuracy of input forecasts such as apportionment likelihood, storage congestion, or demand signals.

Final Takeaway: Transforming Strategy into Competitive Advantage

By integrating mathematical optimization, scalable data engineering, and intuitive deployment, the proposed system transforms the Canadian crude oil supply chain into a dynamic, responsive, and economically optimized operation. It empowers producers with the intelligence and agility required to navigate infrastructure constraints and volatile market conditions effectively. The key outcomes of the system are as follows:
  • Mathematical Optimization: Enables data-driven decision-making across routing, storage, and nomination strategies using MILP and scenario-based modeling. 
  • Scalable Data Infrastructure: Ensures fast, reliable, and high-quality data flow through a cloud-native architecture built on Databricks, Delta Lake, and PySpark. 
  • User-Centric Deployment: Provides flexible access to optimization results through REST APIs, interactive dashboards, and enterprise reporting tools. 
  • Operational Agility: Supports real-time re-optimization and strategic forecasting to adapt quickly to market shocks or infrastructure changes. 
  • Economic Value: Delivers measurable improvements in netbacks, cost efficiency, and risk mitigation across the supply chain.
ThoughtsWin Systems (TWS) brings this vision to life. With deep expertise in energy analytics, optimization modeling, and cloud-native architecture, TWS helps upstream and midstream operators implement this solution at scale, from building custom MILP models and data pipelines to deploying real-time dashboards that deliver immediate insights.

Ready to unlock better netbacks and operational agility?

Contact us today to start your optimization journey.

Take the Next Step with ThoughtsWin Systems

For more information on our services and how we can assist you, reach out to arjun.banga@thoughtswinsystems.com. Let’s revolutionize your data management process together.