Opportunities (and Some Limitations) of Generative AI in IBP

September 25, 2024 by Fizi Yadav

Generative AI is currently in an incubation phase, with research progressing at a rapid pace as new models are constantly emerging. However, the industry is still determining the most effective use cases for these models to drive meaningful change. This change can take various forms, such as increased productivity, unlocking new value propositions or market opportunities, streamlining automation, and enhancing decision-making and creativity.

IBP presents a significant opportunity for generative AI, particularly due to the vast amounts of data spread across various functions and the need to simulate scenarios and quickly find relevant answers. Based on experience with numerous generative AI cases, the following areas offer the greatest potential for uplift:

Providing Data Analysis in Natural Language

As generative AI models become more reliable and less prone to hallucinations, coupled with tighter control over data organization and access, a mechanism can be provided for business users to ask first-level analysis questions and receive answers quickly. Previously, this would require specialized SQL knowledge or reliance on data domain experts. By eliminating that dependency, faster responses and decision-making are enabled. However, it is crucial to implement sufficient guardrails to ensure the accuracy of the analysis. This can be achieved through several mechanisms:

Creating an Effective and Explainable Entity-Relationship Model for Storing Data: The fewer decision blocks a model goes through, the lower the risk of errors. Structuring tables coherently for easier analysis leads to better results. Generative AI should prioritize understanding the context of questions and retrieving relevant data from tables, rather than handling complex calculations. As a guiding principle, if retrieving the response requires a Common Table Expression (CTE), it’s often better to create the response as a view.
Constraining the Agent: It is essential to impose restrictions on how and what data an AI agent can access. Using a SQL agent that fetches data from tables with built-in row and column access controls is more prudent than employing a Python agent that reads flat files. Granular data controls help secure information, reducing the risk of breaches or misuse of agents.
Validation Checks on the Response: Quality controls should be applied to the response itself. These can either be rules-based or encoded as contextual instructions for the AI agent, ensuring it validates the response before providing an answer.
User Training: It is important not to rely solely on the technology. Upskilling users and providing them with foundational knowledge of generative AI, including its limitations, can help them better navigate the augmented processes.
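As an illustration of the rules-based flavor of these guardrails, here is a minimal sketch of a check that runs before any generated SQL reaches the database. It assumes the agent may only issue a single read-only SELECT against an allow-list of views. The view name `v_monthly_demand`, the helper names, and the sample data are hypothetical, and the regex check is a simplification rather than a full SQL parser.

```python
import re
import sqlite3

# Hypothetical allow-list: the only views this agent is permitted to read.
ALLOWED_SOURCES = {"v_monthly_demand"}


def validate_sql(sql: str) -> str:
    """Rules-based guardrail: accept one read-only SELECT over allow-listed sources."""
    statement = sql.strip().rstrip(";")
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"select\b", statement, flags=re.IGNORECASE):
        raise ValueError("only SELECT statements are allowed")
    # Simplified source detection: names that follow FROM or JOIN.
    found = re.findall(
        r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)", statement, flags=re.IGNORECASE
    )
    sources = {name.lower() for name in found}
    if not sources or not sources <= ALLOWED_SOURCES:
        raise ValueError(f"query reads outside the allow-list: {sorted(sources)}")
    return statement


def run_guarded(conn: sqlite3.Connection, sql: str) -> list:
    """Execute the query only if it passes validation."""
    return conn.execute(validate_sql(sql)).fetchall()


# Sample data: a base table plus the view the agent is allowed to query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demand (month TEXT, product TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO demand VALUES (?, ?, ?)",
    [("2024-08", "A", 100), ("2024-08", "B", 40), ("2024-09", "A", 120)],
)
conn.execute(
    "CREATE VIEW v_monthly_demand AS "
    "SELECT month, SUM(units) AS units FROM demand GROUP BY month"
)

rows = run_guarded(conn, "SELECT month, units FROM v_monthly_demand ORDER BY month")
print(rows)
```

Exposing a pre-aggregated view instead of the base table follows the guiding principle above: the model retrieves, and the database does the calculation.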

Creating Human-Readable Reports from Generated Analysis

Foundational models excel at natural language processing and generation, making them capable of transforming data-heavy reports into easily readable formats with the right context and guidance. This is valuable for generating insights and providing context to users who might otherwise require domain knowledge to interpret the information.

Classification and Categorization of Semi-Structured and Unstructured Data

This is a highly impactful use case for generative AI, significantly boosting productivity where traditional rules for categorization are loosely defined. Conventional rules-based or machine learning classifiers rely on structured training datasets, making them less effective with data that has fluid boundaries or is constantly evolving. Generative AI offers superior flexibility in interpreting such changes and is also easier to set up and adapt. In most cases, especially with non-specialized datasets, there’s no need for frequent model retraining.

Automation of Processes and Services

Automation is another key area where generative AI excels, and it has been a central focus for many business applications to date. By combining traditional coding techniques with AI agents, frameworks can be created to automate business processes that are often managed by SaaS models. Some examples include:

On-Demand Infrastructure Provisioning: AI agents can be designed to handle user requests for provisioning VMs, cloud services, or user accounts. These agents have access to service accounts and the necessary permissions to interpret user requests, verify eligibility, and complete the action.
User Onboarding: Early adoption of AI has been seen in user onboarding through chatbots that assist with process queries. There are also innovations, like Google’s tool that converts scientific papers into podcast formats. Similar audio-visual formats can be used to guide users through processes.
Auto-Scheduling of Product Lines: Generative AI can be embedded in workflows to eliminate manual processes and reduce friction. For example, if a production line fails, a user can request rescheduling. The AI agent can initiate an approval process, and upon approval, invoke an optimizer to rebalance production. Here, the agent acts as a mediator between services, enhancing automation.
Route Management in Logistics: Generative AI can augment GIS data to optimize warehouse locations and reduce shipping costs. By analyzing historical data, demand forecasts, and transportation routes alongside GIS mapping, AI can provide real-time analysis of logistics bottlenecks and their solutions.
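The mediator role described in the auto-scheduling example can be sketched as follows. This is only an illustration under stated assumptions: `RescheduleRequest`, `rebalance`, and `handle_request` are hypothetical names, the approval step is a plain callback, and the "optimizer" is a stand-in that spreads the failed line's load evenly rather than a real solver.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class RescheduleRequest:
    failed_line: str
    requested_by: str


def rebalance(load: Dict[str, float], failed_line: str) -> Dict[str, float]:
    """Stand-in optimizer: spread the failed line's load evenly over the other lines."""
    remaining = [line for line in load if line != failed_line]
    if not remaining:
        raise ValueError("no remaining lines to absorb the load")
    share = load[failed_line] / len(remaining)
    return {line: load[line] + share for line in remaining}


def handle_request(
    request: RescheduleRequest,
    load: Dict[str, float],
    approve: Callable[[RescheduleRequest], bool],
) -> dict:
    """The agent mediates: validate the request, seek approval, then call the optimizer."""
    if request.failed_line not in load:
        return {"status": "rejected", "reason": "unknown line"}
    if not approve(request):
        return {"status": "rejected", "reason": "approval denied"}
    return {"status": "rescheduled", "plan": rebalance(load, request.failed_line)}


load = {"line_1": 300.0, "line_2": 200.0, "line_3": 100.0}
result = handle_request(
    RescheduleRequest(failed_line="line_1", requested_by="planner"),
    load,
    approve=lambda request: True,  # placeholder for a real approval workflow
)
print(result)
```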

However, it is worth pointing out that generative AI is still in its infancy in traditional ML domains such as time-series forecasting, optimization, and scenario planning. It should therefore only be used to augment existing models until foundational models become better at these tasks.

Provided below are a few examples where we have successfully implemented and industrialized generative AI-driven processes.

Forecast Narrative

In any planning process, demand forecasting is key and typically spans a defined time period, often updated monthly. While there are various methods for creating forecasts, many businesses are increasingly adopting machine learning-driven forecasts due to their high accuracy and bias reduction. However, these sophisticated models often lack explainability due to their black-box nature, making it challenging for demand planners to justify forecast changes. To alleviate this burden, we developed a Generative AI model that compares two forecasts and constructs a narrative explaining the differences. By structuring the model’s inputs and outputs intelligently and maintaining tight control over data, the AI could access and analyze changes between forecasts, referencing the underlying drivers to provide a plausible explanation. It then generated a complete narrative that planners could use. The key point here is that the Gen AI acts as a support tool, removing roadblocks in the process, although the planner still verifies the findings. This approach reduced the time needed to construct a narrative from a week to just two days.
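To make "structuring the model's inputs" concrete, here is a minimal sketch of how the differences between two forecasts might be computed and packaged before being handed to a language model. It is not the production pipeline: the function names, SKU identifiers, and driver labels are hypothetical, and the model call itself is deliberately left out.

```python
def forecast_deltas(previous: dict, current: dict, top_n: int = 3) -> list:
    """Return the largest absolute changes between two forecasts keyed by item."""
    deltas = []
    for item in sorted(previous.keys() & current.keys()):
        deltas.append(
            {
                "item": item,
                "previous": previous[item],
                "current": current[item],
                "change": current[item] - previous[item],
            }
        )
    deltas.sort(key=lambda d: abs(d["change"]), reverse=True)
    return deltas[:top_n]


def build_prompt(deltas: list, drivers: dict) -> str:
    """Assemble a constrained prompt: the model may only cite the listed drivers."""
    lines = ["Explain the following forecast changes using only the listed drivers."]
    for d in deltas:
        driver_text = ", ".join(drivers.get(d["item"], ["no driver recorded"]))
        lines.append(
            f"- {d['item']}: {d['previous']} -> {d['current']} "
            f"({d['change']:+}); drivers: {driver_text}"
        )
    return "\n".join(lines)


previous = {"SKU-1": 100, "SKU-2": 200, "SKU-3": 50}
current = {"SKU-1": 130, "SKU-2": 190, "SKU-3": 50}
drivers = {"SKU-1": ["promotion"], "SKU-2": ["price increase"]}

deltas = forecast_deltas(previous, current, top_n=2)
prompt = build_prompt(deltas, drivers)
print(prompt)
```

Keeping the arithmetic outside the model and passing in only the computed changes and their drivers is one way to keep tight control over the data while the planner still verifies the resulting narrative.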

Analyst Bot

We developed a helper bot designed to assist business users with data queries. This bot uses a SQL agent to translate natural language queries into SQL, execute them on the server, and then present the results in a standardized format to the user. As discussed earlier, the agent first checks the user’s permissions before fetching any data, and it also runs validation checks on the responses. This greatly democratized data analysis, enabling business users to be more productive in their daily work and significantly reducing the volume of ad hoc data requests to technical teams.

Indirect Cost Classifier

Indirect costs, or overhead costs, are expenses that cannot be directly linked to a specific product or service but are essential for the overall operation of a business. These costs span various activities and departments, making them difficult to track. We had an internal taxonomy for categorizing these costs, but previously, an associate had to manually review receipts collected by the client—a highly time-consuming process. By combining OCR with Generative AI, we created a system to automatically read data from receipts, identify the type of expense, and categorize it based on our taxonomy. This automation reduced the effort from days to mere minutes, streamlining the entire process. Knowing where these costs lie empowers businesses to negotiate or eliminate unnecessary expenses, ultimately enabling more effective cost management and improved control over spending.

Harnessing the Power of Data in IBP Transformation

September 23, 2024 by Fizi Yadav

Harnessing accrued and ever-increasing data continues to be one of the biggest challenges for companies embarking on IBP transformations, and it is also the most important aspect to master for successful execution. Here we delve into the factors that make mastering data so difficult and provide examples and guidance on how to overcome them.

Characteristics

Following are the major characteristics of the data powering IBP.

Accurate: IBP data is used to generate forecasts and aid decision making, so it needs to be precise and reliable. There is no ceiling to the damage that inaccurate data can cause.

Up to date: For the same reason as above, data needs to be real-time or near it, reflecting the current state of the business and its operations. This makes the decision-making process more agile and can also reduce communication touchpoints.

Complete: IBP data is essential to central and holistic decision making. Therefore, it needs to encompass all relevant aspects of the business. These include but are not limited to financial, operational, supply-chain, procurement, and market-related information.

Granular: Operational data specific to supply chain has strict hierarchies especially as it relates to products or customers. Hence, the relevancy of data for the user can vary in granularity, from high-level summaries to detailed transactional data.

The data transformation must holistically encompass the above-mentioned aspects to truly gain insight from the information and provide tangible benefits in the decision-making process. Provided below are five necessary things an organization must do for the said data transformation.

Preparation

Having relevant data for the right users and processes is a crucial precursor to enabling any kind of digital transformation. Most legacy organizations will have multiple systems and data sources. They will also have processes that have been built upon those systems and that often depend on each other for normalizing the data. This often leads to situations where the same metric yields a different number depending on who is looking. As such, sufficient effort has to be focused on:

Disentangling these processes.
Formalizing the source contracts, collection, and refresh cadence of data.
Building out the relevant infrastructure and data pipelines.
Building out the people, systems, and processes to manage the pipelines.
Continuous monitoring and enhancements for robustness of the pipelines. Data will continue to grow, so it is important that the technology evolves alongside it.

This is perhaps the most time-intensive undertaking among all mentioned. So, it is important to do it in a staggered manner that allows for quicker results while also maintaining cohesion between the different services.
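As a small illustration of formalizing source contracts and refresh cadence, the sketch below records each source with an owner and an agreed refresh interval, then flags sources that have gone stale. The class name, source names, owners, and timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SourceContract:
    """A written-down agreement about a data source (hypothetical structure)."""

    name: str
    owner: str
    refresh_every: timedelta

    def is_stale(self, last_refreshed: datetime, now: datetime) -> bool:
        # Stale when more time has passed than the agreed refresh interval.
        return now - last_refreshed > self.refresh_every


contracts = [
    SourceContract("orders", "sales-ops", timedelta(days=1)),
    SourceContract("shipments", "logistics", timedelta(hours=6)),
]

# Illustrative refresh timestamps, as a pipeline monitor might record them.
last_refreshed = {
    "orders": datetime(2024, 9, 22, 6, 0),
    "shipments": datetime(2024, 9, 22, 1, 0),
}
now = datetime(2024, 9, 22, 12, 0)

stale = [c.name for c in contracts if c.is_stale(last_refreshed[c.name], now)]
print(stale)
```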

For a large retail client, we upgraded their ERP system, built an enterprise data platform, and developed downstream data layers that provided a more unified view of the organization's data. This enhanced visibility not only improved operational transparency but also led to a significant increase in data-driven use cases. As a result, the client was able to make more informed decisions and optimize various business processes, driving greater efficiency across the organization.

Scoping

The most crucial aspect for reaching a consensus on numbers is to first define clearly the right sources, transformations and units of the data being utilized for a given process. This is an oft-neglected part of a build that quickly comes to the fore when problems in measurement materialize. It is important to formally acknowledge the data scopes through written records and advertise them broadly so that everyone sings from the same hymn sheet.

Following are a few of the potential hazards in data that one may encounter:

Shipments may be tracked as actualized (net of returns and unfulfilled orders) or as invoiced.
There may be multiple product id standards.
Shipments may have multiple units of measure.
Same data resides in two different sources. Orders are placed in SAP, but actualized shipments are tracked in Kinaxis.
Data categorization happens in multiple systems. Customer orders are recorded in SAP, but the product hierarchy is created in Kinaxis.
Forecasting is done for a subset of customers (say, retail).
Data is not refreshed at the cadence that forecasting requires.
SMEs have incorrect definitions of data attributes and their usage.

Data Model & Platform Design

Once the scope is finalized, we can commence work on the IBP-specific data setup. This means taking the enterprise data and deliberately designing the pipelines and storage architecture that allow downstream IBP tasks and applications to perform fit-for-purpose actions on the data.

The data platform is a layer of fabric connecting multiple sources rather than a single component. Since the underlying storage solutions and their number change with the advent of new technology, you want an abstraction that protects the application code from those changes. This can be a custom interface or an existing utility for a given language. For example, Python has fsspec, an interface for interacting with different storage solutions.
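A custom interface of this kind can be quite small. The sketch below uses only the standard library and is an assumption about what such an abstraction might look like, not the one used on any project: `Storage`, `LocalStorage`, `MemoryStorage`, and `copy_forecast` are hypothetical names. A library such as fsspec plays the same role off the shelf.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class Storage(Protocol):
    """The only surface application code is allowed to depend on."""

    def read_text(self, key: str) -> str: ...

    def write_text(self, key: str, data: str) -> None: ...


class LocalStorage:
    """Backend that stores each key as a file under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def read_text(self, key: str) -> str:
        return (self.root / key).read_text(encoding="utf-8")

    def write_text(self, key: str, data: str) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(data, encoding="utf-8")


class MemoryStorage:
    """Backend that keeps everything in a dictionary, handy for tests."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}

    def read_text(self, key: str) -> str:
        return self._blobs[key]

    def write_text(self, key: str, data: str) -> None:
        self._blobs[key] = data


def copy_forecast(src: Storage, dst: Storage, key: str) -> None:
    """Application code: unaware of which backend sits behind either side."""
    dst.write_text(key, src.read_text(key))


memory = MemoryStorage()
memory.write_text("forecasts/2024-09.csv", "sku,units\nA,120\n")

with tempfile.TemporaryDirectory() as tmp:
    local = LocalStorage(tmp)
    copy_forecast(memory, local, "forecasts/2024-09.csv")
    copied = local.read_text("forecasts/2024-09.csv")

print(copied)
```

Swapping a backend then means writing one new class with the same two methods; `copy_forecast` and any other application code stay untouched.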

The data model refers to the structure of, and relationships between, the different data attributes that drive IBP. As described above, this data comes from disparate operational and functional sources, so the relationships between them have to be formed based on how the data is used and where it is shown.

For a large B2B client, we created an internal planning application that required separating data structures based on two distinct patterns of viewing the data: high-granularity charts and graphs that required a high level of detail in the data, and low-granularity ones that showed trends over a longer time horizon.

Governance

Data governance refers to the framework or practices that organizations implement to ensure that their data is managed effectively, securely, and in compliance with relevant regulations and policies. When it comes to IBP applications, it encompasses the following:

Role-Based Access Control (RBAC)

All interactions on the applications are attributed to clearly defined user personas, and each persona can be assigned to one or more users. These personas govern the rights attached to a given UI component, namely read, edit, delete, etc. In this manner, access controls are enforced through the persona or role. Usually RBAC is integrated with an organization's IT-maintained directory services, the most common being Microsoft Active Directory.

Attribute-Based Access Control (ABAC)

While the personas govern how one interacts with the application, data controls affect what information one sees. A user is assigned a specific set of attributes which act as filters and are passed down from the UI. These are then applied at the source level (database, cache, static files) and the information is returned to the application layer. Different applications might implement this differently, but for off-the-shelf solutions it is usually implemented at the application level itself.
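The two kinds of control can be sketched together in a few lines. This is a simplified illustration with hypothetical roles, attributes, and rows. In practice the attribute filter would be pushed down to the database or cache rather than applied to rows in memory.

```python
# RBAC: what a persona may do with a component (hypothetical roles and rights).
ROLE_PERMISSIONS = {
    "planner": {"read", "edit"},
    "viewer": {"read"},
}


def can(role: str, action: str) -> bool:
    """Return True when the role carries the requested right."""
    return action in ROLE_PERMISSIONS.get(role, set())


def apply_attribute_filters(rows: list, attributes: dict) -> list:
    """ABAC: keep rows whose value for every filtered column is in the user's allowed set."""
    return [
        row
        for row in rows
        if all(row.get(column) in allowed for column, allowed in attributes.items())
    ]


rows = [
    {"region": "EMEA", "channel": "retail", "units": 120},
    {"region": "EMEA", "channel": "wholesale", "units": 80},
    {"region": "APAC", "channel": "retail", "units": 60},
]

user = {
    "role": "viewer",
    "attributes": {"region": {"EMEA"}, "channel": {"retail"}},
}

# The role decides whether the user may read at all; the attributes decide what they see.
visible = apply_attribute_filters(rows, user["attributes"]) if can(user["role"], "read") else []
print(visible)
```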

Data Lineage

Data lineage refers to the ability to track and comprehend the journey of data starting from its origin and ending at its destination. It involves capturing and documenting information about where the data originated, how it underwent transformations and the various steps it went through as it moved across systems, applications and processes.

Organizations benefit from data lineage as it provides them with an understanding of how data is created, manipulated, and utilized within their systems. It helps address questions such as:

Where did the data originate from?
Who accessed or modified it?
Where was it stored?
How was it used or analyzed?

For one client, we built a custom, automated data tracer from the ground up to track data lineage across various sources, both on-premises and in the cloud. This tracer captured and stored metadata related to the flow of data, tailored to the specific storage types (such as Parquet files or relational databases). This solution allowed us to document the entire journey of the data, making it possible to visualize the flow across systems in different formats and at varying levels of detail. This visibility was critical in ensuring data accuracy, transparency, and consistency throughout the organization's IBP processes.
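The core idea of such a tracer can be shown in miniature. The sketch below is not the tracer we built. It is a hypothetical in-memory log that records each hop (source, transformation, target) and walks upstream from any dataset, and the dataset names are illustrative.

```python
from collections import defaultdict


class LineageLog:
    """Minimal lineage store: for each dataset, remember what it was derived from."""

    def __init__(self) -> None:
        self._upstream = defaultdict(list)

    def record(self, source: str, target: str, transformation: str) -> None:
        """Register that `target` was produced from `source` by `transformation`."""
        self._upstream[target].append((source, transformation))

    def trace(self, dataset: str) -> list:
        """Return every upstream hop as (source, transformation, target), nearest first."""
        hops, queue, seen = [], [dataset], {dataset}
        while queue:
            target = queue.pop(0)
            for source, transformation in self._upstream.get(target, []):
                hops.append((source, transformation, target))
                if source not in seen:
                    seen.add(source)
                    queue.append(source)
        return hops


log = LineageLog()
log.record("sap.orders", "staging/orders.parquet", "daily extract")
log.record("staging/orders.parquet", "warehouse.monthly_demand", "aggregate by month")

for hop in log.trace("warehouse.monthly_demand"):
    print(hop)
```

Answering "where did the data originate?" then becomes a walk over recorded metadata instead of an investigation across teams.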

In conclusion, harnessing data for Integrated Business Planning (IBP) transformation is no small feat. From ensuring data accuracy and granularity to managing multiple sources and aligning stakeholders, the path to success requires thoughtful planning and execution. By addressing these challenges—through processes and technology—organizations can unlock the full potential of their data.

The Role of a Cohesive and Unifying Data Strategy in Integrated Business Planning (IBP)

September 16, 2024 by Fizi Yadav

In today’s fast-paced business environment, a cohesive and unifying data strategy is crucial for the success of Integrated Business Planning (IBP). By integrating multiple operations, IBP offers end-to-end supply chain visibility and seamless cross-functional connectivity, and it accelerates key processes such as Sales & Operations Planning (S&OP), Sales & Operations Execution (S&OE), and New Product Development (NPD) cycles, addressing key challenges faced by legacy organizations.

Traditional companies often operate in silos, where each department maintains its own data, benchmarks, and units of measure. This disjointed approach can lead to inefficiencies and missed opportunities, as data from one operation can significantly impact others. For example, promotional activities can dramatically influence demand forecasts, underscoring the necessity of incorporating up-to-date marketing data into demand planning.

Furthermore, as businesses grow and expand into new markets, their operations become increasingly complex, making it difficult to manage without a comprehensive, integrated planning process. This complexity is further compounded by the ever-present risk of supply chain disruptions and the growing demands for sustainability and regulatory compliance. These factors underscore the critical need for a unified, forward-thinking IBP strategy.

To achieve seamless connectivity between different systems, it is essential to conduct a thorough analysis of existing processes and technologies. While this may sound straightforward, implementation can be complex. In this article, we will draw on a previous case study to illustrate the complexities of integrating multiple planning and source data systems, while also highlighting the best practices that have emerged from our work.

A robust IBP system unifies metrics or establishes clear conversion methods between different units of measure, harmonizing data across functions for improved benchmarking and cross-functional analysis. For instance, the sales team may track product movement in cases or pallets, while the manufacturing team monitors production in individual units. Meanwhile, the finance team may focus on the monetary value of goods sold or produced. A standardized conversion method will allow each department to retain its preferred unit of measure but translate all data into a common framework. This harmonization reduces errors, increases visibility, and improves decision-making, as all teams can access consistent, comparable data for more accurate forecasting, budgeting, and operational planning.
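A standardized conversion method can be as simple as a per-product record of conversion factors. The sketch below assumes one product with hypothetical pack sizes and price; `ProductUnits` and its fields are illustrative names rather than part of any particular IBP tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductUnits:
    """Conversion factors for one product (values are illustrative)."""

    units_per_case: int
    cases_per_pallet: int
    price_per_unit: float

    def to_units(self, quantity: float, uom: str) -> float:
        """Translate a quantity in any supported unit of measure into individual units."""
        factors = {
            "unit": 1,
            "case": self.units_per_case,
            "pallet": self.units_per_case * self.cases_per_pallet,
        }
        if uom not in factors:
            raise ValueError(f"unknown unit of measure: {uom}")
        return quantity * factors[uom]

    def to_value(self, quantity: float, uom: str) -> float:
        """Translate a quantity into monetary value for the finance view."""
        return self.to_units(quantity, uom) * self.price_per_unit


product = ProductUnits(units_per_case=12, cases_per_pallet=40, price_per_unit=2.5)

sales_in_units = product.to_units(3, "pallet")        # sales reports pallets
production_in_units = product.to_units(1500, "unit")  # manufacturing reports units
sales_value = product.to_value(3, "pallet")           # finance reports money

print(sales_in_units, production_in_units, sales_value)
```

Each team keeps reporting in its preferred unit, while comparisons happen in the common one.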

IBP also aligns operating cycles for greater efficiency and synchronizes data refresh cadences, ensuring maximum visibility. For instance, supply planners should have access to near real-time updates on shifting demand forecasts, rather than depending on periodic or delayed data. This real-time visibility allows them to respond more quickly to changes, reduce lead times, and optimize inventory levels by adjusting supply plans proactively, ensuring that stock is aligned with actual demand.

Central to this transformation is the creation of a global data repository with functional demarcation, yet accessible across departments as needed. This de-siloed, foundational approach opens the door to new insights and use cases that were previously undiscovered. It also accelerates the development and deployment of such use cases, significantly lowering the cost of adopting new technologies such as Generative AI. The central repository can function as a loosely federated system tailored to specific purposes but combined to provide a holistic view of business operations. Key components include:

Staging Layer: Stores structured and unstructured data (third-party data dumps, incoming analytic feeds, historical datasets).
Big Data Layer: Houses semantic models for all incoming signals (shipment, consumption, inventory).
Relational Layer: Facilitates predictive and optimization modeling (demand, supply, inventory, financial) and captures snapshots of key drivers (promotion, distribution, pricing, third-party coefficients).
In-Memory Layer: Supports real-time reporting and complex datasets (hierarchical current forecast including adjustments, inventory levels).

The industry is now moving towards a consolidated lake house approach, which combines all the layers into a unified system. All data is funneled into a single domain following the medallion architecture: bronze for raw data, silver for cleansed and enriched data, and gold for high-quality, ready-for-use data. This structure improves scalability, ensures real-time insights, and simplifies data management for enhanced decision-making.
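The bronze, silver, and gold stages can be illustrated on a handful of records. This is a toy sketch in plain Python with made-up rows and hypothetical function names. A real lakehouse would run the same steps on a distributed engine over table storage.

```python
# Bronze: raw records exactly as they arrived, inconsistencies included.
bronze = [
    {"sku": " a1 ", "units": "10", "date": "2024-09-01"},
    {"sku": "A1", "units": "5", "date": "2024-09-01"},
    {"sku": "B2", "units": "n/a", "date": "2024-09-01"},
    {"sku": "b2", "units": "7", "date": "2024-09-02"},
]


def to_silver(rows: list) -> list:
    """Cleanse: normalize SKU codes, cast units to int, drop rows that cannot be parsed."""
    silver = []
    for row in rows:
        try:
            units = int(row["units"])
        except ValueError:
            continue  # unparseable quantity: excluded from the cleansed layer
        silver.append({"sku": row["sku"].strip().upper(), "units": units, "date": row["date"]})
    return silver


def to_gold(rows: list) -> dict:
    """Aggregate: total units per SKU, ready for reporting."""
    totals = {}
    for row in rows:
        totals[row["sku"]] = totals.get(row["sku"], 0) + row["units"]
    return totals


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```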

No matter the underlying architecture, the central repository acts as a conduit for data flow between all operations, creating a paradigm where data from various functions integrates seamlessly. To ensure the success of this architecture, it is crucial to establish a clear data governance structure, enforce robust security measures, provide role-based access, and continuously monitor data pipelines.

In conclusion, an overarching focus on data strategy is the cornerstone of effective IBP, driving operational efficiency and business growth. By integrating operations, harmonizing data, and ensuring real-time visibility into it, companies can make informed decisions faster and more effectively. Embracing a unified data strategy within IBP not only aligns with current business demands but also positions organizations for a future of sustained success.

Lighting a Spark

September 16, 2024 by Fizi Yadav

Audience: Anyone who wants to set up a Spark cluster.

Pre-requisite: Knowledge of docker, docker-compose, pyspark

Setting up a Spark Standalone Cluster

To set this up we need a few ingredients, namely Java, Spark, and Python for running PySpark applications. We will set up one master node and three workers, although you can scale up or down as you see fit. Both the master and the workers will come from the same image; only the entrypoint will differ. Your Java installation needs to be 1.8, as that is what Spark 2.4.x runs on. To set up my Dockerfile I start off with a base Alpine Java 8 image. While this is good for standalone mode, you might prefer to use Ubuntu, as I have run into some issues (none that cannot be fixed) when trying to set up a YARN cluster using the same image. So in your project directory create a Dockerfile and add the following lines to it:

FROM openjdk:8-alpine

USER root

# wget, tar, bash for getting spark and hadoop
RUN apk --update add wget tar bash

This will first use the alpine image as our base, make root as the user and download the requisite packages for what’s coming next which is downloading the Spark tar and unpacking it in a directory. I am using /spark as my spark folder. I will then add this to my PATH variable and also declare SPARK_HOME as another environment variable that points to /spark. The last step isn’t necessary from what I have seen so far as spark is smart enough to find the path if the variable isn’t declared but it doesn’t hurt.

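# Download the Spark tarball (the Apache archive URL is an assumption; the original download step is not shown)
RUN wget https://archive.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz && \
    tar -xzf spark-2.4.5-bin-hadoop2.7.tgz && \
    mv spark-2.4.5-bin-hadoop2.7 /spark && \
    rm spark-2.4.5-bin-hadoop2.7.tgz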

# add to PATH
ENV PATH $PATH:/spark/bin:/spark/sbin

ENV SPARK_HOME /spark

Now we are set up with both Java and Spark on our system. But in order to use PySpark we now need to install Python, which the following piece of code accomplishes.

# Install components for Python
RUN apk add --no-cache --update \
    git \
    libffi-dev \
    openssl-dev \
    zlib-dev \
    bzip2-dev \
    readline-dev \
    sqlite-dev \
    musl \
    libc6-compat \
    linux-headers \
    build-base \
    procps 

# Set Python version
ARG PYTHON_VERSION='3.7.6'
# Set pyenv home
ARG PYENV_HOME=/root/.pyenv

# Install pyenv, then install python version
RUN git clone --depth 1 https://github.com/pyenv/pyenv.git $PYENV_HOME && \
    rm -rfv $PYENV_HOME/.git

ENV PATH $PYENV_HOME/shims:$PYENV_HOME/bin:$PATH

RUN pyenv install $PYTHON_VERSION
RUN pyenv global $PYTHON_VERSION
RUN pip install --upgrade pip && pyenv rehash

# Clean
RUN rm -rf ~/.cache/pip

What we do in the lines above is first install all the packages needed to build Python. Git is for cloning the pyenv repository, zlib-dev and bzip2-dev are compression libraries required for the build, and so on. I landed on the required list through some trial and error during installation.

The last step in the dockerfile is embedding the shell scripts that will act as an entry-point to the docker containers. The entrypoint is just the command that starts the spark master or worker node as a daemon. So let’s first create another directory called scripts in our project directory and add two shell files under it namely - run_master.sh and run_worker.sh

run_master has the command to start the spark master node

Different ways of Fibonacci generation using Python

August 26, 2015 by Fizi Yadav in Python

One of the most oft-cited coding questions, especially in internship interviews, is the Fibonacci sequence. Here I provide several different ways to generate Fibonacci numbers in Python, including a generator.

def fib_r(n):
    # Naive recursion: exponential time, fine only for small n
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib_r(n - 1) + fib_r(n - 2)

def fibBinet(n):
    # Binet's closed form; floating point limits its accuracy for large n
    phi = (1 + 5**0.5) / 2.0
    return int(round((phi**n - (1 - phi)**n) / 5**0.5))

def fib_dp(n):
    # Bottom-up table holding every value up to n
    l = [0, 1]
    for i in range(2, n + 1):
        l.append(l[i - 1] + l[i - 2])
    return l[n]

def fib_ultimate(n):
    # Iterative version that keeps only the last two values
    if n == 0:
        return 0
    a, b = 0, 1
    for i in range(n - 1):
        a, b = b, a + b
    return b

def fib_gen():
    # Infinite generator
    a, b = 0, 1
    while True:            # First iteration:
        yield a            # yield 0 to start with and then
        a, b = b, a + b    # a will now be 1, and b will also be 1, (0 + 1)
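Because the generator never terminates on its own, it needs to be bounded by the caller. A minimal usage sketch, repeating the generator so the snippet runs on its own and using `itertools.islice` from the standard library:

```python
from itertools import islice


def fib_gen():
    """Infinite Fibonacci generator, same logic as the version above."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Take the first ten values without ever materializing the infinite sequence.
first_ten = list(islice(fib_gen(), 10))
print(first_ten)

# Or pull values one at a time with next().
gen = fib_gen()
first = next(gen)
second = next(gen)
third = next(gen)
print(first, second, third)
```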

Using Bootstrap pagination to page through divs on same page

August 25, 2015 by Fizi Yadav in RandomCode

I have been playing around with the Bootstrap framework, and it's a boon in terms of layout, providing all the tools to lay out your CSS without much work. It provides simple pagination, but most of the tutorials online, including the Bootstrap documentation, focus on the layout of the HTML rather than on using it to provide functionality. I am using Bootstrap pagination to page through content on the same page, which contains multiple divs. An active page is actually an active div, which I swap out when moving to the next page. So in a way it acts like a carousel (which is what I looked at initially). So without further ado, here's the HTML to set up the pagination:

<div class="pagination-container" >
   <div data-page="1" >
      <p>Content for Div Number 1</p>
   </div>
   <div data-page="2" style="display:none;">
      <p>Content for Div Number 2</p>
   </div>
   <div data-page="3" style="display:none;">
      <p>Content for Div Number 3</p>
   </div>
   <div data-page="4" style="display:none;">
      <p>Content for Div Number 4</p>
   </div>
   <div data-page="5" style="display:none;">
      <p>Content for Div Number 5</p>
   </div>

   <div class="text-center">
   <div class="pagination pagination-centered">
       <ul class="pagination ">
            <li data-page="-" ><a href="#" >&lt;</a></li>
            <li data-page="1"><a href="#" >1</a></li>
            <li data-page="2"><a href="#" >2</a></li>
            <li data-page="3"><a href="#" >3</a></li>
            <li data-page="4"><a href="#" >4</a></li>
            <li data-page="5"><a href="#" >5</a></li>
            <li data-page="+"><a href="#" >&gt;</a></li>
      </ul>
   </div></div></div>

And here's the JavaScript for swapping out the div. If you wish to use the Bootstrap styling, make sure the page is structured as shown above. The ul class needs to be named pagination, and it needs to be enclosed in a div with class "text-center", which helps with centering the page controls.

<script>
var paginationHandler = function(){
    // store pagination container so we only select it once
    var $paginationContainer = $(".pagination-container"),
        $pagination = $paginationContainer.find('.pagination ul');
    // click event
    $pagination.find("li a").on('click.pageChange',function(e){
        e.preventDefault();
        // get parent li's data-page attribute and current page
        var parentLiPage = $(this).parent('li').data("page"),
            currentPage = parseInt( $(".pagination-container div[data-page]:visible").data('page') ),
            numPages = $paginationContainer.find("div[data-page]").length;
        // make sure they aren't clicking the current page
        if ( parseInt(parentLiPage) !== parseInt(currentPage) ) {
            // hide the current page
            $paginationContainer.find("div[data-page]:visible").hide();
            if ( parentLiPage === '+' ) {
                // next page
                $paginationContainer.find("div[data-page="+( currentPage+1>numPages ? numPages : currentPage+1 )+"]").show();
            } else if ( parentLiPage === '-' ) {
                // previous page
                $paginationContainer.find("div[data-page="+( currentPage-1<1 ? 1 : currentPage-1 )+"]").show();
            } else {
                // specific page
                $paginationContainer.find("div[data-page="+parseInt(parentLiPage)+"]").show();
            }
        }
    });
};
$( document ).ready( paginationHandler );
</script>

Graph class

August 16, 2015 by Fizi Yadav in Data Structure

Here's a simple Graph data structure that builds upon the previous Vertex data structure. The graph maintains an inner dictionary of all the vertices in it and a count of the vertices. There are functions to create a vertex and to add edges between two vertices, which internally call Vertex member functions.

Header

#ifndef __graphs__graph__
#define __graphs__graph__

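#include <map>
#include <memory>
#include <vector>
#include "vertex.h"

class Graph {
    // Maps a vertex id to the shared Vertex instance
    std::map<char, std::shared_ptr<Vertex>> _vertDict;
    int _numVertices;
    
public:
    Graph() : _numVertices(0) {}
    std::vector<char> getVertices();
    void addVertex(char);
    std::shared_ptr<Vertex> getvertex(char);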
    void addEdge(char,char,int);
    int getWeight(char,char);
};

#endif /* defined(__graphs__graph__) */

Source

#include "graph.h"
void Graph::addVertex(char id){
    std::shared_ptr<Vertex> pv = std::make_shared<Vertex>(id);
    _vertDict.insert(std::map<char, std::shared_ptr<Vertex>>::value_type(id, pv));
    _numVertices++;
}

void Graph::addEdge(char id1, char id2, int weight){
    std::map<char, std::shared_ptr<Vertex>>::iterator it1 = _vertDict.find(id1);
    std::map<char, std::shared_ptr<Vertex>>::iterator it2 = _vertDict.find(id2);
    if (it1 == _vertDict.end() || it2 == _vertDict.end()) {
        return;
    }else{
        it1->second->addNeighbor(weight, it2->second);
    }
}

std::shared_ptr<Vertex> Graph::getvertex(char id){
    std::map<char, std::shared_ptr<Vertex>>::iterator it = _vertDict.find(id);
    if (it != _vertDict.end()) {
        return  it->second;
    }else{
        return nullptr;
    }
}

std::vector<char> Graph::getVertices(){
    std::vector<char> ids;
    for (std::map<char, std::shared_ptr<Vertex>>::iterator iter = _vertDict.begin(); iter != _vertDict.end(); ++iter){
        ids.push_back(iter->first);
    }
    return ids;
}

int Graph::getWeight(char id1, char id2){
    std::map<char, std::shared_ptr<Vertex>>::iterator it1 = _vertDict.find(id1);
    std::map<char, std::shared_ptr<Vertex>>::iterator it2 = _vertDict.find(id2);
    if (it1 != _vertDict.end() && it2 != _vertDict.end()) {
        return it1->second->getWeight(*it2->second);
    }else{
        return -1;
    }
}

Test

#include <iostream>
#include "vertex.h"
#include "graph.h"

int main(int argc, const char * argv[]) {    
    Graph g;
    g.addVertex('a');
    g.addVertex('b');
    g.addVertex('c');
    g.addEdge('a', 'b', 9);
    g.addEdge('a', 'c', 5);
    for (auto i: g.getVertices())
        std::cout << i <<'\n';
    std::shared_ptr<Vertex> v1;
    v1 = g.getvertex('a');
    for (auto i: v1->getConnections())
        std::cout << i.second << ':'<< i.first <<'\n';
    std::cout << g.getWeight('a', 'b') << '\n';
    return 0;
}

Simple Vertex Class

August 14, 2015 by Fizi Yadav in Data Structure

I am beginning to get back into C++ in my downtime. This is a simple Vertex class that has two members: an id and an adjacency list. The list is a multimap that maps the weight of each path leaving the Vertex to a reference to the neighbor that path leads to. The references are shared pointers since I didn't wish to deal with manual memory management.

Vertex header class

#ifndef __graphs__vertex__
#define __graphs__vertex__

#include <stdio.h>
#include <map>
#include <memory>   // std::shared_ptr
#include <utility>  // std::pair
#include <vector>

class Vertex {
    char _id;
    std::multimap<int, std::shared_ptr<Vertex>> _adjList;
    
public:
    void addNeighbor(int weight,std::shared_ptr<Vertex> neighbor);
    std::vector<std::pair<int,char>> getConnections();
    char getId();
    int getWeight(Vertex);
    Vertex(char);
    friend bool operator== (Vertex & lhs, Vertex & rhs );
    
};
#endif /* defined(__graphs__vertex__) */

Vertex definition class

#include "vertex.h"

Vertex::Vertex(char id){
    _id = id;
}

void Vertex::addNeighbor(int weight,std::shared_ptr<Vertex> neighbor){
    _adjList.insert(std::multimap<int, std::shared_ptr<Vertex>>::value_type(weight, neighbor));
}

std::vector<std::pair<int,char>> Vertex::getConnections(){
    std::vector<std::pair<int,char>> ids;
    for(std::multimap<int, std::shared_ptr<Vertex>>::iterator it = _adjList.begin(); it != _adjList.end(); ++it) {
        ids.push_back(std::make_pair(it->first, it->second->_id));
    };
    return ids;
}

char Vertex::getId(){
    return _id;
}

int Vertex::getWeight(Vertex v){
    for(std::multimap<int, std::shared_ptr<Vertex>>::iterator it = _adjList.begin(); it != _adjList.end(); ++it) {
        if (*it->second==v) {
            return it->first;
        };
    };
    return -1;
}

bool operator== (Vertex & lhs, Vertex & rhs ){
    return (lhs.getId() == rhs.getId()) ;
}

Main file for example

#include <iostream>
#include "vertex.h"

int main(int argc, const char * argv[]) {
    Vertex v1('a');
    v1.addNeighbor(9, std::make_shared<Vertex>('b'));
    v1.addNeighbor(6, std::make_shared<Vertex>('c'));
    for (auto i: v1.getConnections())
        std::cout << i.second << ':'<< i.first <<'\n';
    return 0;
}

Calculating BS European Call & Put prices

February 28, 2015 by Fizi Yadav in Quant

A simple C++ script to calculate Black-Scholes European call and put prices.
The inputs are the underlying asset price (S), the strike price (K), the risk-free rate (r, as a decimal, e.g. 5% as 0.05), the volatility of the underlying (v, as a decimal), the time to maturity (T, in years), and a flag selecting a call or a put.
The output is the calculated call or put price.
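For reference, these are the standard Black-Scholes formulas for a non-dividend-paying underlying, written with the same symbols as the member variables in the code (S, K, r, v, T). N denotes the standard normal cumulative distribution function.

```latex
\begin{aligned}
d_1 &= \frac{\ln(S/K) + \left(r + \tfrac{1}{2}v^2\right)T}{v\sqrt{T}}, \qquad
d_2 = d_1 - v\sqrt{T} \\
C &= S\,N(d_1) - K e^{-rT} N(d_2) \\
P &= K e^{-rT} N(-d_2) - S\,N(-d_1)
\end{aligned}
```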

#include "stdafx.h"

namespace BS{

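    // Forward declaration: getCDF calls stdnorm, which is defined further down.
    double stdnorm(double x);

    // Standard normal CDF via a five-term polynomial approximation.
    double getCDF(double x){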

        double b0 = 0.2316419;
        double b1 = 0.319381530;
        double b2 = -0.356563782;
        double b3 = 1.781477937;
        double b4 = -1.821255978;
        double b5 = 1.330274429;
        double t = 1 / (1 + b0*x);

        double prod = b1*t + b2*pow(t, 2) + b3*pow(t, 3) + b4*pow(t, 4) + b5*pow(t, 5);

        if (x >= 0.0) {
            return (1 - stdnorm(x)*prod);
        }
        else {
            return 1.0 - getCDF(-x);
        }

    }

    double stdnorm(double x) {

        return (1.0 / (pow(2 * M_PI, 0.5)))*exp(-0.5*x*x);
    }


    BSEur::BSEur() :
        r(0.1), v(0.25), K(100), T(1.0), S(100), isCall(true){}

    BSEur::BSEur(double r, double v, double K, double T, double S, bool isCall){

        this->r = r;
        this->v = v;
        this->K = K;
        this->T = T;
        this->S = S;
        this->isCall = isCall;
    }

    BSEur::~BSEur(){}

    double BSEur::getd(){
        return (log(S / K) + (r + pow(v, 2)*0.5)*(T)) / (v*sqrt(T));
    }

    double BSEur::getCallPrice(){
        double d1 = getd();
        double d2 = d1 - v*sqrt(T);

        /*printf("d1: %lf\n", d1);
		printf("d2: %lf\n", d2);*/

        return (S*getCDF(d1)) - (K*exp(-r*(T))*getCDF(d2));
    }

    double BSEur::getPutPrice(){
        double d1 = getd();
        double d2 = d1 - v*sqrt(T);

        // Put price: K e^{-rT} N(-d2) - S N(-d1)
        return (K*exp(-r*(T))*getCDF(-d2)) - (S*getCDF(-d1));
    }


    void BSEur::printPrice(){
        if (isCall)
        {
            cout << "Price of option: " << getCallPrice()<<endl;
        }
        else
        {
            cout << "Price of option: " <<  getPutPrice()<<endl;
        }
    }
}

Calculating Greeks in C

February 28, 2015 by Fizi Yadav

In quantitative finance, the Greeks are quantities that measure how sensitive the price of a derivative, such as an option, is to a change in one of the underlying parameters on which the value of the instrument or portfolio depends. The name is used because the most common of these sensitivities are denoted by Greek letters. There are plenty of online resources for understanding the underlying mathematical formulas, but here's a quick script that computes the Greeks of a European call.
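As a compact reference, these are the closed-form Greeks of a European call on a non-dividend-paying underlying, in the script's notation (S, K, r, v, T, t). Here \tau = T - t is the time to maturity, N is the standard normal CDF and \varphi is its density.

```latex
\begin{aligned}
d_1 &= \frac{\ln(S/K) + \left(r + \tfrac{1}{2}v^2\right)\tau}{v\sqrt{\tau}}, \qquad
d_2 = d_1 - v\sqrt{\tau} \\
\Delta &= N(d_1) \\
\Gamma &= \frac{\varphi(d_1)}{S\,v\sqrt{\tau}} \\
\text{Vega} &= S\,\varphi(d_1)\sqrt{\tau} \\
\Theta &= -\frac{S\,\varphi(d_1)\,v}{2\sqrt{\tau}} - r K e^{-r\tau} N(d_2) \\
\rho &= K\,\tau\,e^{-r\tau} N(d_2)
\end{aligned}
```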

//
//  main.c
//  stat598a1
//
//  Created by Fizi Yadav on 1/20/14.
//  Copyright (c) 2014 Fizi Yadav. All rights reserved.
//

#include <stdio.h>
#include <math.h>


double getCDF(double);
double stdnorm(double);
double getd(double S, double K, double r, double v, double T, double t);
double getCallPrice(double S, double K, double r, double v, double T,double t);
double getDelta(double S, double K, double r, double v, double T, double t);
double getGamma(double S, double K, double r, double v, double T, double t);
double getVega(double S, double K, double r, double v, double T, double t);
double getTheta(double S, double K, double r, double v, double T, double t);
double getRho(double S, double K, double r, double v, double T, double t);
void printGreeks(double S, double K, double r, double v, double T, double t);


int main(int argc, const char * argv[])
{
    
    double S = 40.0;   // Underlying asset price
    double K = 45.0;   // Strike price
    double r = 0.08;    // Risk-free rate. Percentages must be provided as decimal eg: (5%) as (0.05)
    double v = 0.05;    // Volatility of the underlying. Percentages must be provided as decimal
    double T = 3.0;     // Time of maturity. Must be in years
    double t = 0.0;     // Start time
    
    printf("\nParameters:\n");
    printf("Underlying Asset Price: %lf\n", S);
    printf("Strike Price %lf\n", K);
    printf("Risk-Free Rate: %lf\n", r);
    printf("Volatility: %lf\n", v);
    printf("Time to maturity: %lf\n", T-t);
    
    printf("\nCall Price: %lf\n", getCallPrice(S, K, r, v, T, t));
    
    printGreeks(S, K, r, v, T, t);

    return 0;
}


//calculate normal CDF given x

double getCDF(double x){
    
    double b0 = 0.2316419;
    double b1 = 0.319381530;
    double b2 = -0.356563782;
    double b3 = 1.781477937;
    double b4 = -1.821255978;
    double b5 = 1.330274429;
    double t = 1/(1+b0*x);
    //double stdnorm = 0.398942*pow(2.71828, -0.5*pow(x, 2));
    
    double prod = b1*t+b2*pow(t,2)+b3*pow(t,3)+b4*pow(t,4)+b5*pow(t,5);
    
    if (x >= 0.0) {
        return (1-stdnorm(x)*prod);
    } else {
        return 1.0 - getCDF(-x);
    }
    
}

//calculate call price of a european option

double getCallPrice(double S, double K, double r, double v, double T, double t){
    
    double d1=getd(S, K, r, v, T, t);
    double d2 = d1 - v*sqrt(T-t);
    
    printf("d1: %lf\n", d1);
    printf("d2: %lf\n", d2);
    
    return (S*getCDF(d1))-(K*exp(-r*(T-t))*getCDF(d2));
    
}

//print the greeks of a european call option

void printGreeks(double S, double K, double r, double v, double T, double t){
    
    /* some sensitivities are quoted in scaled-down terms, to match the scale of likely changes in the parameters. Rho is reported divided by 100 , vega by 100 (1 vol point change), and theta by 365 or 252 (1 day decay based on either calendar days or trading days per year). */
    
    printf("\nGreeks:\n");
    
    printf("Delta: %lf\n", getDelta(S, K, r, v, T,t));
    printf("Gamma: %lf\n", getGamma(S, K, r, v, T,t));
    printf("Theta: %lf\n", getTheta(S, K, r, v, T,t)/365.0);
    printf("Vega: %lf\n", getVega(S, K, r, v, T,t)/100.0);
    printf("Rho: %lf\n", getRho(S, K, r, v, T,t)/100.0);
    
}


// Standard normal probability density function
double stdnorm(double x) {
    
    return (1.0/(pow(2*M_PI,0.5)))*exp(-0.5*x*x);
}

double getd(double S, double K, double r, double v, double T, double t){
    
    return (log(S/K)+(r+pow(v,2)*0.5)*(T-t))/(v*sqrt(T-t));
    
}



// Calculate the European call Delta

double getDelta(double S, double K, double r, double v, double T, double t) {
    return getCDF(getd(S, K, r, v, T, t));
}

// Calculate the European call Gamma

double getGamma(double S, double K, double r, double v, double T, double t) {
    return stdnorm(getd(S, K, r, v, T, t))/(S*v*sqrt(T-t));
}

// Calculate the European call Vega

double getVega(double S, double K, double r, double v, double T, double t) {
    return S*stdnorm(getd(S, K, r, v, T, t))*sqrt(T-t);
}

// Calculate the European call Theta

double getTheta(double S, double K, double r, double v, double T, double t) {
    
    double d1=getd(S, K, r, v, T, t);
    double d2 = d1 - v*sqrt(T-t);
    
    return (-(S*stdnorm(d1)*v)/(2*sqrt(T-t)))
    - (r*K*exp(-r*(T-t))*getCDF(d2));
}

// Calculate the European call Rho

double getRho(double S, double K, double r, double v, double T, double t) {
    
    return K*(T-t)*exp(-r*(T-t))*getCDF(getd(S, K, r, v, T, t)-(v*sqrt(T-t)));
}