Engineering @Zeta

Presto (PrestoDB): What it Offers and Where and How it Can be Used

What is Presto?

Presto (or PrestoDB) is a fast, reliable, distributed SQL query engine that queries large data sets (TBs to PBs) across heterogeneous data sources and processes them in memory. It originated at Facebook because Hive took too long to execute queries over data of TB and PB magnitude; Hive stored intermediate results on disk, which incurred significant disk I/O overhead. In 2015, Netflix showed PrestoDB was 10 times faster than Hive. Presto is written in Java. It resembles a massively parallel processing (MPP) system: it separates storage from compute, which allows it to scale its computing power horizontally by adding more servers.

What Presto is Not

Although Presto understands SQL, it is not a general-purpose relational database. It is not a replacement for MySQL, Oracle, etc., even though it provides many features of a standard database, and it was not designed to handle OLTP. Its main benefit and value can be seen in data warehousing and analytics, where a large volume of data is collected from various sources to produce reports; it fits into the world of OLAP.

Presto Architecture

Presto Concepts

Coordinator: This is the brain of Presto. It receives the query from the client, parses it, plans the execution, and manages the worker nodes. It keeps track of activity on the worker nodes and coordinates the execution of the query. It fetches the results from the worker nodes and returns the final result to the client. Additionally, Presto runs a discovery service on the coordinator, where each worker registers and periodically sends its heartbeat. This runs on the same HTTP server, including the same port.

Worker: Worker nodes are the nodes that execute tasks and process data. They fetch data from connectors and exchange intermediate data with each other. Communication between the coordinator and workers, between the coordinator and clients, and between workers happens over HTTP.

Connector: Presto uses connectors to connect to various data sources. In the world of databases, this equates to DB drivers. Each connector needs to implement four SPIs (Service Provider Interfaces):

  1. Metadata SPI
  2. Data Location SPI
  3. Data Statistics SPI
  4. Data Source SPI

Catalog: The catalog contains schemas and references to a data source via a connector.

Schema: A schema is a collection of tables. In RDBMS like PostgreSQL and MySQL, this translates to the concept of Schema or a Database.

Table: Collection of data in terms of rows, columns, and associated data types.

Statement: Statements are defined in the ANSI SQL standard, consisting of clauses, expressions, and predicates.

Query: When Presto parses a statement, it converts it into a query and creates a distributed query plan, consisting of a series of interconnected stages that contain all of the elements described below.

Stage: The execution is structured in a hierarchy of stages that resembles a tree. They model the distributed query plan but are not executed by the worker nodes.

Task: Each stage consists of a series of tasks that are distributed over the Presto worker nodes. Tasks contain one or more parallel drivers.

Split: Tasks operate on splits, which are sections of a larger data set.

Driver: Drivers work with the data and combine operators to produce output that is aggregated by a task and delivered to another task in another stage. Each driver has one input and one output.

Operator: An operator consumes, transforms, and produces data.

Exchange: Exchanges transfer data between Presto nodes for different stages in a query.

Presto Connectors

Overall, there are 30+ known connectors that Presto supports. The following are a few well-known ones:

  1. BigQuery Connector
  2. Cassandra Connector
  3. Elasticsearch Connector
  4. Hive Connector
  5. JMX Connector
  6. Kafka Connector
  7. MongoDB Connector
  8. Oracle/MySQL/PostgreSQL Connector
  9. Prometheus Connector
  10. Redis Connector
  11. Redshift Connector
  12. Delta Lake Connector

Event Listener

One of the nice things about Presto is its clean abstractions; one such abstraction is the Event Listener. Event listeners allow you to write custom functions that listen to events happening inside the engine. Event listeners are invoked for the following events:

  1. Query creation
  2. Query completion
  3. Split completion

To create a custom listener, we need to do the following (a minimal sketch follows the steps):

  1. Implement the EventListener and EventListenerFactory interfaces.
  2. Register the plugin and deploy it to Presto.
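
For illustration, here is a minimal sketch of such a listener. The class names below are made up, and the exact package and event accessor names should be verified against the Presto SPI version you build against:

import java.util.Map;

import com.facebook.presto.spi.eventlistener.EventListener;
import com.facebook.presto.spi.eventlistener.EventListenerFactory;
import com.facebook.presto.spi.eventlistener.QueryCompletedEvent;
import com.facebook.presto.spi.eventlistener.QueryCreatedEvent;

// Logs query lifecycle events emitted by the engine.
public class QueryLogListener implements EventListener {

    @Override
    public void queryCreated(QueryCreatedEvent event) {
        System.out.println("Query created: " + event.getMetadata().getQueryId());
    }

    @Override
    public void queryCompleted(QueryCompletedEvent event) {
        System.out.println("Query completed: " + event.getMetadata().getQueryId());
    }
}

// The factory is returned from a Plugin's getEventListenerFactories() so Presto can discover it.
class QueryLogListenerFactory implements EventListenerFactory {

    @Override
    public String getName() {
        return "query-log-listener";
    }

    @Override
    public EventListener create(Map<String, String> config) {
        return new QueryLogListener();
    }
}

The packaged plugin is then deployed to Presto's plugin directory and enabled through an event-listener configuration that names the listener.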

Query Optimization

PrestoDB uses two optimizers. The Rule-Based Optimizer (RBO) applies filters to remove irrelevant data and uses hash joins to avoid full Cartesian joins. This includes strategies such as predicate pushdown, limit pushdown, column pruning, and decorrelation. It also uses a Cost-Based Optimizer (CBO), which uses statistics of the table (e.g., number of distinct values, number of null values, distributions of column data) to optimize queries and reduce I/O and network overhead. The following are ways to see the available statistics and the cost-based analysis of a query:

SHOW STATS FOR table_name — Approximated statistics for the named table

SHOW STATS FOR ( SELECT query ) — Approximated statistics for the query result

EXPLAIN SELECT query — Show the logical or distributed execution plan of the statement, with the estimated cost of each operation, without executing it.

EXPLAIN ANALYZE SELECT query — Execute the statement and show the distributed execution plan with the cost and duration of each operation.

SQL Language and SQL Statement Syntax

We can use the DDL, DML, DQL, DCL, and TCL statements that modern databases support. The following are supported in PrestoDB:

  1. DDL — Create, Alter, Drop, Truncate
  2. DML — Insert, Delete, Call
  3. TCL — Commit, Rollback, Start Transaction
  4. DQL — Select
  5. DCL — Grant, Revoke

It also supports the following data types

BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT, DOUBLE, DECIMAL, VARCHAR, CHAR, JSON, DATE, TIME, TIMESTAMP, ARRAY, MAP, IPADDRESS

An Example with JMX Connector

Java Management Extensions (JMX) provides information about the Java Virtual Machine and the software running inside the JVM. With the JMX connector, we can query JMX information from all nodes in a Presto cluster. The connector can also be configured so that chosen JMX information is periodically dumped and stored in tables (in the “jmx” catalog) that can be queried later. The JMX connector is useful for debugging and monitoring Presto metrics.

To configure the JMX connector, create the catalog properties file etc/catalog/jmx.properties with the following:

connector.name=jmx

The JMX connector supports two schemas: current and history.

To enable periodic dumps, define the following properties:

connector.name=jmx

jmx.dump-tables=java.lang:type=Runtime,com.facebook.presto.execution.scheduler:name=NodeScheduler

jmx.dump-period=10s

jmx.max-entries=86400

We will use the JDBC driver, com.facebook.presto.jdbc.PrestoDriver, to connect to Presto. The following program extracts the JVM version of every node in the cluster:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class JmxRuntimeQuery {

    public static void main(String[] args) {
        String dbUrl = "jdbc:presto://localhost:9000/catalogName/schemaName";

        try {
            // Load the Presto JDBC driver (optional with JDBC 4+ drivers, which auto-register).
            Class.forName("com.facebook.presto.jdbc.PrestoDriver");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            return;
        }

        // The MBean table name contains special characters, so it is double-quoted.
        String sql = "SELECT node, vmname, vmversion FROM jmx.current.\"java.lang:type=runtime\"";

        // try-with-resources closes the connection, statement, and result set automatically.
        try (Connection conn = DriverManager.getConnection(dbUrl, "username", "password");
             Statement stmt = conn.createStatement();
             ResultSet res = stmt.executeQuery(sql)) {

            while (res.next()) {
                String node = res.getString("node");
                String vmname = res.getString("vmname");
                String vmversion = res.getString("vmversion");
                System.out.println(node + ": " + vmname + " " + vmversion);
            }
        } catch (SQLException se) {
            se.printStackTrace();
        }
    }
}

To see all of the available MBeans, run SHOW TABLES FROM jmx.current.

To see the open and maximum file descriptor counts for each node, run the following query:

SELECT openfiledescriptorcount, maxfiledescriptorcount
FROM jmx.current."java.lang:type=operatingsystem"

Where can Presto be used?

  1. It can be used in a data warehouse, where data in TBs and PBs is fetched from multiple sources to query and process large datasets.
  2. It can be used to run ad hoc queries from various sources through multiple connectors, anytime we want and wherever the data resides.
  3. It can be used for generating reports and dashboards, as data collected from various sources in multiple formats is brought together for analytics and business intelligence.
  4. We can aggregate TBs of data from multiple data sources and run ETL queries against that data; instead of legacy batch processing systems, we can use Presto to run efficient, high-throughput queries.
  5. We can query data on a data lake without the need for transformation. We can query any type of data in a data lake, both structured and unstructured, as there are various connectors to pull from either kind of source.

Author: Bhargav Maddikera

Effective Troubleshooting

Troubleshooting is a part of an engineer’s life. Whether it is API timeouts, issues with functionality, misconfigurations, or any number of other issues, we often need to roll up our sleeves and fix things. Based on my experience and tenure at Zeta, I would like to share some guidelines, learning resources, and tips and tricks that have helped me troubleshoot issues.

Guidelines

Incidents can come at any time and challenge us in new ways. Continuous preparation and learning equip us to solve them. The list below is ever-growing and will continue to evolve, and I have attempted to capture some key information with the 4Os in mind.

The 4Os

  • Observability: Signals emitted by the application contribute toward observability. Lack of observability increases the MTTD (mean time to detect), because in the absence of the right signals troubleshooting is based on hypotheses and requires some trial and error to confirm and replicate the issue before fixing it.
  • Operability: Controls to operate the system, like turning features on and off, updating configurations, bumping resources, restarting applications, etc. Good operability controls help solve incidents once the root cause is identified and help reduce the MTTR (mean time to recover).
  • Optimization: No system can support 1 million TPS from day one. Optimization needs to happen continuously and keep up with the traffic expected from our customers. This involves not only code changes but also tuning configurations, the choice of resources, etc.
  • Onboarding: The majority of lower-severity incidents or issues might be due to misconfigurations. Right onboarding with proper steps becomes crucial to avoid incidents related to this.

Preparing for Troubleshooting

Preparing the Application

  • Ensure your application is publishing the right signals.
  • Use structured logging, as it helps capture important attributes in logs (such as entityID and requestID) and analyze them end to end.
  • Design the system and APIs with operability in mind. Always design CRUD APIs for an entity and make sure you can use them to fix data issues, disable product features temporarily, etc.
  • If operability controls cannot be exposed as APIs, do expose them via JMX (see the sketch after this list). Operations that can be performed via JMX are as follows:
    • Clear Cache
    • Change log levels
    • Disable features
    • Increase or decrease cache size
  • Get PGWatch and PGBadger enabled for all the PostgreSQL databases your application connects to. This helps in troubleshooting query performance, and these can be monitored regularly.
  • Good test coverage might not seem related to incidents, but having around 80% coverage not only helps prevent incidents but can also help in reproducing issues locally.
  • Benchmark the performance of the application with a dedicated setup and know the TPS of your APIs. This tells you the right production configuration and the supported TPS, and the exercise is a good learning experience in itself.
  • Have runbooks handy around the flows with mitigation steps.
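
To make the JMX point concrete, here is a minimal, hypothetical sketch of exposing one such control (changing a log level) as an MBean using the standard javax.management API; the bean name and classes are made up for this example:

import java.lang.management.ManagementFactory;

import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical operability control exposed via JMX.
public interface LogControlMBean {
    String getLogLevel();
    void setLogLevel(String level);
}

class LogControl implements LogControlMBean {
    private volatile String logLevel = "INFO";

    @Override
    public String getLogLevel() {
        return logLevel;
    }

    @Override
    public void setLogLevel(String level) {
        // In a real application this would reconfigure the logging framework.
        this.logLevel = level;
    }
}

class JmxRegistration {
    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Register the control so it can be invoked from a JMX console (e.g., JConsole).
        server.registerMBean(new LogControl(), new ObjectName("com.example.ops:type=LogControl"));
        Thread.sleep(Long.MAX_VALUE); // keep the JVM alive so the MBean can be inspected
    }
}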

Preparing the Cluster

  • Ensure auditing is proper at all layers. This is very useful in determining which calls landed on your APIs, how much time they took, and who called them.
  • Ensure all the applications emit appropriate signals and can be effectively monitored.
  • Ensure the customer services owned by the cluster are well known and documented, with runbooks prepared for them.
  • Ensure configurations to operate the system are properly documented, idempotent, and have minimal steps. Keep iterating on them to include new features.
  • Maintain a runbook, organized by application and customer service, documenting flows along with known issues and resolution steps.

Preparing Yourself

  • Be familiar with the observability tools used. They provide a lot of insight while troubleshooting incidents.
  • Product Context helps a lot. Go through the resources like training videos, documents, and code to be familiar with the critical flows. Connecting the business context, domain, and technical context helps relate the issue with the impact and probable fault points.
  • Know thy system well, and know the systems you depend on and the systems that depend on you even better.
  • Familiarity with the tools like Kibana, Prometheus Queries, Grafana, Eclipse MAT, Kubectl, etc., helps a lot.

During Troubleshooting

  • While restarting to solve a problem, always take a thread dump and heap dump of the Java process first. It's all about the evidence.
  • Always check whether the problem is with only one instance, which may point to multi-threading/concurrency issues. Deadlocks can cause this; in that case, taking a thread dump and restarting can quickly resolve the issue.
  • There are many ways to solve a problem. Try to get out of the incident/issue first and then work on improvement. Sometimes out-of-the-box solutions can save us a lot of time and get us out of tough situations.
  • When reporting an issue or passing the baton to another team, always provide as much supporting information as possible. Filling this into the FIR helps with triaging and can sometimes yield quick pointers from a bird's-eye view. Some examples are as follows:
    • Kibana Link containing logs
    • Inputs Passed
    • APIs called
    • Errors observed
    • Time Window
    • Grafana Dashboards Link containing key metrics which might indicate a problem
    • Sequence of Steps performed
    • Configurations related to the issue
    • Code references for your team or others
  • Actively report your observations in the preferred internal communication medium for triaging issues or incidents. It helps keep stakeholders posted and might help with parallel debugging and some eureka moments.

After Troubleshooting

  • Rigor in RCA (root cause analysis) and IAIs is very important. Whether the issue affects one user or millions of users does not matter; ignoring an issue with lower impact might lead to it escalating in terms of impact.
  • Always try to identify IAIs. The IAIs can be process related, product related, or tech related. They might be specific to one cluster or applicable across clusters, which is also acceptable.
  • When doing the RCA of an incident, consider the 5 Whys complete only when you know the root cause whose fix will resolve the issue for good and avoid recurrence.
  • Capture all the evidence in the RCA Document. Since links expire, capturing screenshots helps.
  • Do not forget to prioritize the IAIs.
  • All the evidence may not be available. Refer to the Tips and Tricks section on how to solve these.
  • Find IAIs which improve the 4Os.

Tips and Tricks

There are times when we fall short on observations and are unclear about what to do next. Some tips and tricks that can help are as follows:

Timeouts

  • Ensure the common libraries used have the right instrumentation and logs to track ingress and egress flows. If available, use them to check logs around the relevant time window.
  • Add logs and metrics around ingress and egress flows for an application and simulate again to reproduce.
  • Check the resources allocated to Kubernetes pods. CPU throttling of even 1% can impact the application heavily.
  • Check the code line by line from source to destination to find inefficiencies (a connection pool sketch follows this list). Some of the common inefficiencies are:
    • Connection pool settings for HTTP calls.
    • Connection pool settings for DB calls.
    • Time taken by external calls: p95 and p99 metrics and variations in them.
    • Requests getting queued in the executor used for external calls.
    • Time spent in executor queues.
  • Check the data: the problem might be with the setup and a specific input, as the associated data might be the reason for the inefficiency.
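
As an illustration of the connection pool points above, here is a hedged sketch using HikariCP; the specific pool sizes and timeouts are made-up examples, not recommendations, and should be derived from your own benchmarking:

import javax.sql.DataSource;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

// Explicit DB connection pool settings instead of relying on defaults.
public class DbPool {

    public static DataSource create(String jdbcUrl, String user, String password) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(jdbcUrl);
        config.setUsername(user);
        config.setPassword(password);

        // Size the pool for the expected concurrency; too small a pool queues requests.
        config.setMaximumPoolSize(20);
        config.setMinimumIdle(5);

        // Fail fast instead of letting callers wait indefinitely for a connection.
        config.setConnectionTimeout(2_000);   // milliseconds
        config.setIdleTimeout(60_000);
        config.setMaxLifetime(30 * 60_000);

        return new HikariDataSource(config);
    }
}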

Queries taking time on PostgreSQL RDS

  • Use EXPLAIN and EXPLAIN ANALYZE to check the query plan.
  • If your database queries are taking more than 20ms at p95, especially for a table with fewer than 1 million rows, assume there is a problem and start analyzing it.
  • The slowest individual queries report in PGBadger helps.
  • Ensure RDS has sufficient resources and is not running low on CPU, memory, or IOPS.
  • Enable Performance Insights to monitor instance performance if you are not getting a sense of the issue.
  • PGWatch has nice dashboards that capture very useful information about what's happening in the database. Check out the Juno integration in OWCC and ensure PGWatch is enabled with its dashboards getting populated.

Learning Resources

Tools

Author: Shubham Jha

Edits by: Mercy Mochary, Swetha Kommi

Reviewed by: Phani Marupaka

Zeta Tech Stack

Zeta is in the business of providing a full-stack, cloud-native, API-first neo-banking platform including a digital core and a payment engine for issuance of credit, debit, and prepaid products. These products enable legacy banks and new-age fintech institutions to launch modern retail and corporate fintech products. 

Zeta currently provides its platform and products to BFSI issuers in India, Asia, and LATAM. Zeta’s products are used by banks such as RBL Bank, IDFC First Bank, and Kotak Mahindra Bank, 14000 corporates, and over 2 million users. Zeta is a SOC 2, ISO 27001, ISO 9001:2008, PCI DSS certified company.

How does Zeta continuously deliver at these high levels without compromising on security? Let us take a look at Zeta’s tech stack to find out.

Server Applications

A server application is designed to install, operate, and host applications and associated services for end-users.

Compute Runtimes

Technology used: Kubernetes, GitOps, Apache Openwhisk, Apache Flink, and Camunda.

The following are the runtime applications at Zeta.

  • Atlantis is a Docker container runtime deployed using GitOps-based end-to-end orchestration with the help of Kubernetes.
  • Aster is an open-source serverless platform built on Apache Openwhisk that runs various short-lived jobs on demand.
  • Rhea is an application used to execute BPMN workflows that run on the Camunda platform.
  • Perseus is an adaptation of Apache Flink used to run complex event processing tasks. Apache Flink is a framework and processing engine used for computations at any scale.

We use Camunda to define workflows and automate decision-making. 

Apache Flink provides a high-throughput, low-latency streaming engine and supports event-time processing and state management. It has allowed us to perform computations at any scale.

Cluster Management

Technology used: Kubernetes, Kong, and Calico.

A cluster management tool is an essential service used to group clusters and nodes. It is used to:

  • Monitor nodes in the cluster.
  • Configure services.
  • Administer cluster services. 

We prefer using open-source systems in our tech stack. That is why we use Kubernetes as our orchestration system to efficiently manage and deploy containers. 

Ingress and API Gateways run using Kong. Kong supports high availability clusters and an extensive range of plugins to address various concerns like authentication, security, and monitoring.

We have also developed 2 in-house tools, Sprinkler and Hades. Sprinkler is our Egress gateway and Hades is our file input/output gateway. 

We use the Calico plugin to manage the container network interface.

We are evaluating Istio to adopt it as our open service mesh. A service mesh controls how different parts of an application share data with each other.

Messaging Infrastructure

Technology used: Apache Kafka, Apache Nifi, Apache Flink, KSQL, and Kafka Connect.

A messaging system transfers data from one application to another. We use the following components to seamlessly transfer messages within Zeta’s various applications and clusters.

  • Apache Kafka is used as a message broker. It allows us to handle high volumes of data and passes messages from one end to the other. 
  • Atropos is a Kafka-based message integration bus developed by Zeta.
  • Apache Nifi is used to implement ETL Pipelines. 
  • Perseus, developed by Zeta, uses Apache Flink to process complex events. 
  • Sinope uses KSQL and Kafka connectors to bridge the message/event world and the batch/file world. KSQL and Kafka connectors enable rich data integration and streaming.

Data Stores

Technology used: PostgreSQL, Amazon S3, Amazon Redshift, ClickHouse, and Redis.

PostgreSQL is used to store transactional data, and Amazon S3 is our data lake. We use Amazon Redshift and ClickHouse as our data warehouses and Redis, an open-source in-memory data structure store, as our cache.

Application Performance Monitoring (APM) and Business Performance Monitoring (BPM) use the Elasticsearch engine.

Data Processing

Technology used: Presto SQL, Sparta, and Apache Superset.

The data processing unit is an essential part of any server application. It converts data available in the system (machine-readable data) to a human-readable format. 

We use the Presto SQL query engine to process big data and Sparta for stream processing. Stream processing allows us to convert data in motion directly to continuous streams. 

The Power Centre stores various data models, allows us to modify data, and enables the reuse of data. It is an adaptation of Apache Superset.

CI/CD

Technology used: Jenkins, ArgoCD, and Sonarqube.

Jenkins is used for Continuous Integration (CI) and ArgoCD for Continuous Deployment (CD).

Our servers use Sonarqube, an open-source platform, for static code analysis.

Observability

Technology used: Grafana, Kibana, Fluentd, and Hypertrace.

Observability is defined as the ability to determine the internal states of a system from its external outputs. External outputs can be anything, such as alerts, visual representations, and infrastructure traces.

We use Prometheus as our event monitoring system. Prometheus stores all metrics in its time-series database.

We use Grafana for our dashboards and alerts and Kibana for log analysis. The purpose of both these applications is to visualize and navigate data. We also use Fluentd, an open-source data collector, for logs and metrics collection.

All applications should have a distributed tracing system. Hypertrace is our tracing system. It helps us debug applications and log information about their execution.

Security Monitoring

Technology used: HELK and Jackhammer.

Security monitoring is another essential unit; it analyzes data to detect any suspicious behavior or changes to the network.

HELK is an ELK (Elasticsearch, Logstash & Kibana) stack with advanced hunting analytic capabilities provided by the implementation of Spark and Graphframes technologies.

Kratos, an in-house adaptation of Jackhammer, is our security CI.

We are evaluating RockNSM, a premier sensor platform for Network Security Monitoring (NSM) hunting and incident response (IR) operations. 

Hardware Security Module (HSM)

Technology used: Safenet, Gemalto, and nCipher/Entrust nShield Solo.

An HSM manages digital keys, performs encryption and decryption, and provides strong authentication. We use the following HSMs:

  • Safenet and Gemalto as Network HSM.
  • nCipher/Entrust nShield Solo.
  • Harpocrates, developed by Zeta, is our HSM as a service interface. 

Mobile Application

Technology used: Firebase Hosting, Websockets, and HTTPS.

Languages used: Kotlin, Java, Objective-C, and Swift.

We develop mobile applications as native apps, React Native apps, and web apps.

Kotlin and Java are used to develop our Android applications, and Objective-C and Swift for our iOS applications.

Firebase Hosting helps with analytics and crash detection on the server. Websockets and HTTPS enable two-way communication between the server and the user.

Web Applications

Technology used: Typescript, SCSS transpilers, Vue.js, Bulma, Buefy, Webpack, Rollup, Verdaccio, Lerna, and Sentry.

Languages used: HTML, CSS, Javascript, Node JS, and Express JS.

We use HTML, CSS, Javascript, Node JS, and Express JS to develop our web applications.

Typescript and SCSS transpilers convert program code from one language to another.

Vue.js is our Javascript framework and Bulma is our CSS framework. Buefy is a UI component library built with both Bulma and Vue.js. Webpack and Rollup are our JavaScript build tools.

Verdaccio, an NPM registry, serves as our private NPM repository. The repository is managed by Lerna, a workflow optimization tool. Sentry, an application monitoring tool, is used to monitor errors.

Cyber Security

Technology used: Trivy, Clair, CloudSploit, MobSF, and ZAP.

The challenge when building a product is balancing ease of use with security. At Zeta, we use Trivy for Docker image scanning in CI and Clair for Docker image scanning in CD.

CloudSploit is our Cloud Security Posture Management tool. We use MobSF for mobile application security.

ZAP is used for web application security, and all of these tools are integrated into CI/CD.

Best Tech Stack = Best Performance?

Building a system that can provide these functionalities and process over 1 million transactions per second is not easy. While Zeta uses the best tech stack, building a robust framework to continuously deliver at the highest levels without compromising on security would not have been possible without Zeta’s talented engineers.

Buy Now Pay Later: How Does it Work?

In our previous 2 blogs (BNPL: The Modern-day Credit Card & BNPL: The Real Deal or Just Hype), we compared the buy now pay later model to traditional credit cards and looked at its market share, both globally and in India. While it is a viable alternative to credit cards, has gained a lot of worldwide traction, and is capturing a lot of market share, how does it work?

Operating Models

Buy now pay later has multiple operating models.

  • EMI with interest
  • EMI without interest
  • Deferred payment model
  • Down payment and loan model
  • Hybrid model, which allows users to defer the down payment while the balance is converted to EMIs.

Let us look at each model in detail.

EMI with interest

Here, the entire bill amount is converted to a loan.

  • Interest rates vary between 12% and 36%.
  • Loan terms range from 2 to 36 months.
  • The EMI amount includes the interest charged: EMI = (Principal + Interest)/Tenure.
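
For example (using hypothetical numbers), a ₹12,000 purchase with ₹1,440 of total interest over a 12-month tenure works out to an EMI of (12,000 + 1,440)/12 = ₹1,120 per month.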

EMI without interest

Here, companies refund the interest as a cashback to the customer or as a discount on the price of the product or service.

  • Effective interest rate of 0%.
  • Fixed or short-term loans (up to 12 months).
  • EMI does not have an interest component. EMI= Principal/Tenure.

Deferred payment

Here, the bill amount is converted to a short-term loan, which the customer pays back to the company within a fixed duration (typically 15 or 30 days).

In this scenario, companies may charge the customer processing fees or convenience fees.

Down payment loan model

This is similar to traditional loans. A customer makes a purchase by making a down payment, and the balance amount is converted to EMIs.

Here, companies charge customers an interest, which is added to the EMI.

This is mostly used by companies such as Bajaj Finserv and Home Credit for offline transactions. Each transaction is treated as a separate loan and reflects in the customer’s CIBIL report.

Hybrid Model

This is a combination of the deferred payment and down payment loan model.

Customers can pay the entire bill amount back within a fixed duration or convert the amount to EMI.

Charges

Typically, companies do not charge customers registration fees or annual maintenance fees to use their buy now pay later product. This, however, does not mean that there are no charges when you make payments using the buy now pay later option.

Apart from interest, companies charge customers processing and convenience fees, among other fees. While these are lower than what traditional credit cards would charge you, they are not non-existent.

Fees

Interest on loan amount

The interest charged on the loan. It typically varies between 12% and 36% and can go up to 66%.

The interest rate depends upon the company’s business model, total loan amount, product or service purchased, and loan tenure.

Processing fee

A fee is charged to process the transaction. Typically 0%. However, some companies might charge a small amount.

Convenience fees

A fee is charged to use the service. Varies from company to company.

For example, Paytm charges 0% to 3% of your net monthly spending.

Late fee

A fee charged when you miss your payments. This varies from company to company and with the outstanding amount.

Interest on non-payment of minimum amount

The interest charged when you do not make the minimum monthly payment.

Varies from company to company. Could be between 18% and 36%.

Preclosure charges

Amount charged to close your loan early.

Varies from company to company. For example, LazyPay charges 4% of the outstanding amount when you close your loan early.

Revolve fees

A fee charged when you request an extension on the due date. Varies from company to company.

When you request an extension, you are not charged late fees or interest for missing your minimum monthly payment.

If granted, it allows you to make the payment at a later date without it being marked as a non-payment by the company, and your CIBIL score is not affected.

Activation fees

Most companies do not charge you an activation fee. However, some do. For example, Mobikwik charges a one-time activation fee of ₹99.

Other fees

Some companies have other charges such as joining fees, subscription fees, annual charges, or EMI bounce charges.

For example, Dhani Pay has a subscription fee varying between ₹199 and ₹1,799 per month.

Companies such as Bajaj Finserv and Home Credit charge an annual fee of ₹99.

Future of Buy Now Pay Later

There has been an increased uptake for the buy now pay later model in the educational sector as well. School and college fees are not small payments, especially when they are paid quarterly or annually. The buy now pay later model has made it easier to make these payments.

UPI payments have made digital payments more accessible to everyone in India. The tea shop down the road and even your local barber accept UPI payments. Companies such as LazyPay have integrated buy now pay later with UPI to capitalize on the popularity of UPI and make buy now pay later more accessible.

Other companies such as Slice, ZestMoney, Mystro, Krazybee, and Lazypay are exploring new ways to offer buy now and pay later to customers. One such idea is to offer gift vouchers on EMI. Users can purchase gift vouchers of different amounts and different brands and pay for them using the buy now pay later method. This means financial institutions need not tie up with merchants to offer buy now pay later to their customers. This will make a popular product even more popular and more accessible.

Thank You

Phani Marupaka, Hemchander Gunashekar

Buy Now Pay Later: The Real Deal or Just Hype

In our previous blog (BNPL — The Modern Day Credit Card), we compared buy now pay later to traditional credit cards. We compared the two products and spoke about the different types of buy now pay later offered to customers. But, is buy now pay later all it is made out to be or is it just a flash in the pan?

Global Market

Globally, providers such as Klarna, Afterpay, and Affirm have paved the way for the buy now pay later ecosystem. In 2019, buy now pay later had a global market size of US$7.3 billion. Coherent Market Insights expected it to grow to US$33.6 billion by 2027 at a CAGR of 21.2%.

While Australia and Sweden were the top markets a few years back, the buy now pay later trend has caught on in the UK and the US.

Worldpay suggests that buy now pay later spending in the UK will rise from £9.6 billion in 2020 to £26.4 billion in 2024. According to Forbes, Americans made $20 billion worth of purchases using BNPL programs in 2019 and were expected to spend $24 billion on products and services using a BNPL service in 2020.

Key metrics of some global players:

Afterpay

  • Operations in Australia, USA, UK, New Zealand.
  • 7.3 Million users.
  • Major Market: USA
  • Market Cap: $17.7BN

Klarna

  • Operations in 15 countries.
  • Market Cap: $33BN

Affirm

  • Operations in the USA and Canada.
  • Market Cap: $16.73 Bn

Indian Market

The trends are similar in India. Businesswire expects buy now pay later adoption to grow steadily between 2021 and 2028, recording a CAGR of 24.2%. It expects the gross merchandise value in India to increase from US$6,990.5 million in 2020 to US$52,827.2 million by 2028.

Buy now pay later is offered by multiple companies in India. LazyPay, ZestMoney, and Simpl are a few of the pure buy now pay later companies. E-commerce companies such as Amazon and Flipkart, wallet providers such as Paytm and Mobikwik, banks such as ICICI and HDFC, and payment gateways such as Razorpay also offer this as an option to their customers.

Key metrics of some Indian players:

LazyPay (PayU)

  • 4 million active users.
  • 2 million transactions per month (December 2020).
  • 250+ merchant base.

ZestMoney

  • 6 million registered users (February 2020).
  • Average transaction: Rs. 12,000/-
  • Tie-up with 15,000+ merchants.

Flipkart PayLater

  • 65 million existing Flipkart users.
  • Active on Flipkart, Myntra and 2Gud.
  • Focuses on customers in tier 2 and tier 3 cities.

Paytm Postpaid

  • 7 million users (November 2020).
  • Tie-up with 5 lakh+ merchants.

ICICI PayLater

  • Available only to ICICI Bank customers on ICICI platforms such as Pockets, iMobile App, and their bank website.
  • Default payment option for Razorpay buy now pay later customers.

Bajaj Finserv

  • Available all over India.
  • Available in offline and online stores.
  • 8,000+ online stores.
  • 1 lakh+ offline stores in 1,900+ cities.

PineLabs

  • Provides buy now pay later via POS machines
  • Has 95% of the offline buy now pay later market share.
  • 30 million users.
  • Tie-up with 1.5 lakh merchants.

Here to Stay

Buy now pay later is an attractive offering from financial institutions. In 2020, buy now pay later accounted for 2.1% of ecommerce transactions worldwide, according to Worldpay. This number is expected to double by 2024.

In India, buy now pay later grew over 30% in 2019. According to a survey conducted by ZestMoney in 2020, 60% of women surveyed said they would use buy now pay later and 51% said they prefer it over credit cards.

Buy Now Pay Later: The Modern Day Credit Card

Zeta, the 14th Indian unicorn in 2021, is rethinking payments from core to the edge, algorithms to form factors, applications to solutions. Cipher, one of Zeta’s offerings, was able to successfully handle 1 Million Transactions per Second (1M TPS), illustrating the scalability and elasticity with which banks can handle online transactions. In our endeavor to make payments invisible, we are exploring and evaluating the buy now pay later payment (BNPL) model.

BNPL allows consumers to make purchases without any upfront payments. A modern-day offering from financial institutions, BNPL offers consumers small loans to buy the products or services they want. These loans are typically interest-free as long as they are paid back within the specified time frame.

Key features of BNPL are:

  • Easy and instant credit to people in the 18–40 age group with little to no credit history.
  • Offers interest-free EMIs to consumers for online or offline purchases, including school fees, college fees, and medical bills.

BNPL vs Traditional Credit Cards

In today’s world where customers are spoilt for choice and want to be rewarded for making purchases, taking more than a few minutes to onboard a customer and adding extra charges and hidden fees to their payments is a turnoff. Financial institutions also want to capitalize on the trend of instant buying by offering a quick checkout experience where first-time customers do not have to enter details such as card number, CVV, and OTP to make a purchase.

BNPL has grown in popularity with customers and financial institutions because of its various benefits over traditional credit cards.

BNPL accounts can be created instantly within minutes making it the more convenient option for first-time users. Credit cards traditionally could take up to a week to be approved and then there is the time it takes for it to be shipped to the customer. The customer then has to activate it before they can use it.

Unlike traditional credit cards, BNPL charges very little in terms of registration fees, annual charges, processing charges, convenience fees, etc. making it more appealing to customers.

This is not to say BNPL does not charge customers anything extra. In the BNPL model, fees are mostly charged in the form of interest for the borrowed amount and late fees for missed payments. However, most financial institutions offer 0% EMI on BNPL payments, making it even more popular. This has helped rapidly grow the BNPL customer base.

Credit Cards: A thing of the past?

The BNPL industry has grown both globally and in India over the last few years. The 2020 pandemic has only increased its popularity.

In India, the payment option has grown in popularity in tier 2 and tier 3 cities where consumers do not have easy access to credit cards or cannot get easy short-term, low-interest loans.

BNPL has numerous advantages over traditional credit cards. Traditional credit cards still have a place, especially for corporate applications, but the benefits and convenience BNPL offers are sure to see it continue to grow in popularity over the next few years.

Up Next…

In our next blogs (BNPL: The Real Deal or Just Hype & BNPL: How Does it Work?), we will discuss the global and Indian BNPL markets and go into detail about how the BNPL model works.

Data deletion & right to be forgotten

Shaik Idris, Director of Data Platforms at Zeta, was part of a roundtable discussion at Rootconf’s Data Privacy Conference held in April 2021. The roundtable was moderated by Venkata Pingali from Scribble Data and included Sreenath Kamath from Hotstar.

The roundtable focused on handling data deletion practices in engineering and product.

Why is data deletion important?

In today’s data-driven world, user data is very valuable. We have a number of cases where user data has been misused or leaked. Maintaining customer satisfaction and trust is critical for any company. Data deletion ensures a customer’s data cannot be used for anything else apart from the intended purpose.

What are the legal obligations of a company when it comes to data deletion?

The law requires companies to inform customers why they are collecting data from them and where it will be used. It also requires companies to provide customers with the:

  • Right to know what information the company has about them.
  • Right to ask the company to delete any personal information they have.

Retrospectively making a company General Data Protection Regulation (GDPR) compliant is impossible. This is because a lot of existing companies ignore the first principle of GDPR, privacy by design. New companies have the flexibility to model data from the onset to ensure they are compliant with the regulations.

Data Deletion: A Company’s Legal Obligation

What is realistically possible in a complex, evolving organization?

Steps companies can take are:

  • Identifying data sources. This problem is especially big in startups.
  • Streamlining the data model. Customer data should be treated and managed as master data. A lot of companies do not do this.
  • Don’t use personally identifiable information (PII) such as phone number or email address as your primary key.
  • Ensure you always use structured data sets.
  • Use proper naming conventions for columns and variables.
  • Have processes to avoid internal data leaks.

Data Deletion: Realistic Possibilities

Can we use tools to help implement the above suggestions?

Tooling helps. We can use tools to scan data sources to identify and flag inappropriately named columns or even auto tag all instances where customer data was stored. However, for tools to be effective, companies should have dedicated data stewards for each data domain.

Data Deletion: What tool can we use

How can you say that you have realistically implemented delete?

First, we must differentiate between what data a company should retain for audit purposes and what data can be deleted. For example, financial institutions are required to retain information for 7 years for audit purposes. Information collected for marketing purposes, on the other hand, can be deleted.

Few things companies can do are:

  • Collect the data you need. Keep this as minimal as possible.
  • Unless required by compliance, do not retain data. Any data you retain is a liability.
  • Anonymize PII in data sources.
  • Guard your communication channels and maintain a list of users who want to be forgotten.

Data Deletion: Realistic Implementation

Conclusion

The law allows users to discover what a company knows about them and ask companies to delete their personal information. There is ambiguity about what it means to delete information. The law requires companies to demonstrate that they have tried to delete a user’s personal data. This is not possible if companies have an uncontrolled, undisciplined data environment.

Companies can use tools to ensure the proper handling of customer data. But, as the saying goes, prevention is better than a cure. By collecting only the required data and deleting data that you are not required to retain, companies can avoid the hassle of data deletion and data leaks.

About Shaik Idris

Shaik Idris is an experienced architect and proven leader in the field of Big Data and Cloud. He has worked in top startups and product companies and with open source for over a decade. He has helped companies build high-performance teams and data organizations from scratch.

Zeta is rethinking payments from core to the edge, algorithms to form factors, applications to solutions. Having built a modern stack that Financial Institutions (FIs) can use for debit, credit, and prepaid cards, loans, authentication, and Fraud and Risk Management (FRM), Zeta invites you to join their journey in democratizing payments. Check out the openings on Zeta's career page: https://www.zeta.tech/in/careers

Thank You

Speaker: Idris Ali

Blogged by: Hemchander Gunashekar

Program Management: A Beginner’s Guide

Rohit Kamat, Program Manager at Zeta, talks about his experience as a program manager and key learnings from his time at Zeta. He talks about topics such as stakeholder management, defect management, and key learning from sprint and scrum planning.

Stakeholder Management

When working on a project, managing your stakeholders is just as important as managing your deliverables. Unless you identify, analyze, plan, and review your stakeholders, you would not know who these people are, what they are responsible for, and what their expectations are from the project and you.

Similar to how you do not want to over-assign tasks to people working on your projects, it is also important to not burn yourself out talking to a lot of people. As a program manager, you must know how to prioritize stakeholders and derive the most value from interactions with them.

There are multiple approaches to identify and prioritize stakeholders, let us take a look at 2 of these.

  • Power Interest Grid
  • Stakeholder Salience Model

Power Interest Grid

This chart maps the authority or power of stakeholders against how much interest they take in the project. Based on where a stakeholder falls on the two axes, we can decide what type of information they would need and what type of engagement we would want to have with them.

Stakeholder Salience Model

This chart maps the types of stakeholders against power, legitimacy, and urgency.

Explainer video- https://zeta.ap.panopto.com/Panopto/Pages/Embed.aspx?id=de7b601b-1906-45e3-8481-ad21007ef87f

Key Learnings from Sprint and Scrum Planning

We know how important sprint and scrum planning is in any organization. What we don’t talk about is its implementation. Poorly planned and executed sprints could hamper progress, rather than streamline it. For example, having someone who is not part of the team plan a sprint could be a recipe for disaster.

Explainer video — https://zeta.ap.panopto.com/Panopto/Pages/Embed.aspx?id=d114e9e9-c3b1-4f95-8c22-ad21007f4b01

Below are some key learnings from sprint planning:

  • To change the system you have to be part of the system.
  • Reflect → Tune → Adjust. Revisit your scrum practices often and make changes as needed. If something works well, figure out why it has worked well and how it can help other aspects of the scrum. If something is not working well, fix it.
  • The key to early and continuous delivery is communication and transparency.
  • Your scrum or sprints do not have to align with those of your clients. Do not force this.
  • Do not divide scrum teams into smaller teams.
  • Achieving a velocity target should not be the objective of a scrum. It should be to find new/different ways to achieve it.
  • Rituals do not make you agile.

Defect Management

How well you manage defects defines your last-mile delivery success. Listed below are some ways you can track defects.

  • JIRA Dashboards help you list and track defects.
  • Triages help you understand the priority of defects.

Explainer video — https://zeta.ap.panopto.com/Panopto/Pages/Embed.aspx?id=a0b1ac42-a7df-40e3-8b14-ad21007f4f9d

Conclusion

Program management is a critical role in any organization that requires you to interact with various stakeholders across multiple disciplines. Your ability to obtain the required information from these stakeholders while keeping them informed about progress is essential to the smooth functioning of a team and ensuring deadlines are met.

Speaker: Rohit Kamat

Blogged by: Hemchander Gunashekar

Multi-tenant SDK Authentication

Apollo App Center is a marketplace for SDKs. Each SDK can be published by different publishers. Each publisher has its own identifiers for its users.

A few examples:

  • Google Pay app may have SDK from ICICI Bank, HDFC Bank, and SBI Bank.
  • Ola app may have SDK from Google Maps, Weather Channel, and ICICI Bank.
  • Zeta Benefits Wallet app may have RBL Cards SDK, CCAvenue Payment Gateway SDK, Razorpay Payment Gateway SDK, Stripe SDK, etc.

In all of the above cases, the app hosting the SDKs wants to integrate various services from third parties using the SDKs provided by those parties to provide increased functionality to its users. But each of those SDKs represents isolated services from that of the parent app and the user of the app could be using those services from various other interfaces outside the purview of the parent app or parent app provider.

Thus, the identity domains and thereby the authentication domains of the app and the included SDKs are different. Unless there is some federation of identity and authentication, the user of the app has to establish their identity using potentially different authentication mechanisms of each SDK. For example, a Zeta Benefits Wallet user may have to log in several times within the app to access services provided by different SDKs. An Ola app user would have a similar experience.

The objective of the authentication federation approach is to avoid the need for each SDK publisher to authenticate the user separately.

Why not conventional mechanisms?

In the conventional OAuth flows the App Provider is treated as a client of the User of each publisher and authorization is provided to the App Provider to access resources corresponding to the User.

In sensitive domains such as banking and payments, the user resources cannot be accessed by any third-party service provider. The Publisher ensures only the end-user is accessing the resources using the SDKs provided by the publisher.

Therefore, although the SDKs are embedded by the app provider, the app provider themselves are not acting as a client of the user. Thus, a conventional OAuth approach to federate identities from publisher to app provider is not possible.

The solution we implemented federates the identity provided by the App Provider, with the Publisher as the relying party. The publisher may rely on the identity and credentials provided by the app provider as is, depending on the reliability of the app provider's authentication, or may offer additional challenges to the user before granting access to the service.

Shown below is the authentication flow; an illustrative sketch of the identity forwarding follows the steps.

  1. Establish identity with App Providers.
  2. Forward that identity established in step 1 to Publisher1 in a reliable way.
  3. Publisher1 verifies the forwarded identity and grants access to the user as Identity@Publisher1
  4. Forward the identity established in step 1 to Publisher2 in a reliable way.
  5. Publisher2 verifies the forwarded identity and grants access to the user as Identity@Publisher2.
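
To make the flow concrete, the following is a purely illustrative sketch of one way steps 2 through 5 could be implemented, using a signed JWT as the forwarded identity. The jjwt library is used here only for illustration; the claim values and key handling are made up, and this is not Zeta's actual implementation:

import java.security.KeyPair;

import io.jsonwebtoken.Claims;
import io.jsonwebtoken.Jws;
import io.jsonwebtoken.Jwts;
import io.jsonwebtoken.SignatureAlgorithm;
import io.jsonwebtoken.security.Keys;

public class IdentityForwardingSketch {

    public static void main(String[] args) {
        // Key pair owned by the app provider; the public key is shared with each publisher.
        KeyPair appProviderKeys = Keys.keyPairFor(SignatureAlgorithm.RS256);

        // Steps 2 and 4: the app provider forwards the identity established in step 1
        // as a signed assertion addressed to a specific publisher.
        String assertion = Jwts.builder()
                .setIssuer("app-provider")   // who established the identity
                .setSubject("user-123")      // Identity@AppProvider (made-up value)
                .setAudience("publisher-1")  // which publisher this assertion is meant for
                .signWith(appProviderKeys.getPrivate())
                .compact();

        // Steps 3 and 5: the publisher verifies the signature, checks the audience,
        // and maps the subject to Identity@Publisher, optionally adding its own challenges.
        Jws<Claims> verified = Jwts.parserBuilder()
                .setSigningKey(appProviderKeys.getPublic())
                .build()
                .parseClaimsJws(assertion);

        Claims claims = verified.getBody();
        if ("publisher-1".equals(claims.getAudience())) {
            System.out.println("Granting access as " + claims.getSubject() + "@Publisher1");
        }
    }
}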

Setup an Authentication mechanism for the SDK

Zeta’s entire SDK range uses the above authentication mechanism through the Apollo App Center. Different teams working on their respective mobile SDKs use the Apollo App Center to publish and distribute them. The consumers of these SDKs, for example fintechs, neobanks, and VBOs, can sign up for the SDKs through the Apollo App Center. We are continuously working towards extending support for different banks and financial institutions to publish their own SDKs on the Apollo App Center.

Two client-side concerns come up when hosting multiple SDKs in a single app:

  • Memory optimization
  • Multi-modality

Resource crunch

As mentioned above, we might have a situation where we want to integrate multiple SDKs hosted on the Apollo App Center. Since we already have the authentication module built, the first thought that comes to mind is to create multiple instances of authentication objects, one per SDK. This is resource-intensive and does not scale. Zeta is continuously building newer SDKs to cater to client requirements, and with each new SDK we would instantiate new objects, which is highly unoptimized for resource-constrained mobile devices.

Enter REST mobile modules!

We want all modules we build to start acting as small microservices following the REST principle of statelessness. The idea is to pass the information needed by the module as parameters, in the context of an API call, instead of passing them as initialization parameters. This helps us in avoiding the creation of multiple instances for the same class.

In the above diagram, you can see that there is a single instance of the authentication module object in memory. The consumer of the authentication module passes the authentication context as part of the API call.

Multi-modality of authentication modules

Different SDKs need different authentication mechanisms to make authenticated API calls. Let’s say that the Cards SDK is making a simple REST API call to generate authentication tokens. On the other hand, the Payment SDK uses an authenticated socket connection to make the API call. With this requirement in mind, the authentication module now supports different types of authentication clients.

All the SDKs can now tell the authentication module the type of authentication clients they want to use. In the diagram below, you can see that Cards SDK makes a request to initialize its auth context by generating an auth token with an HTTP API call; whereas the Payment SDK needs a separate socket connection, which in itself would be authenticated.

Here is a short code snippet showing how simple it is to specify the authentication client inside the SDK.

// Choose the authentication client based on the transport the SDK needs
if (useWebSocket) {
    ApolloUserauthmanagerCompFactory.init(
        ApollouseWebSocketAuthSessionComponentFactory.getInstance().authSessionClient())
} else {
    ApolloUserauthmanagerCompFactory.init(
        ApolloRestAuthSessionClientFactory.getInstance().authSessionClient())
}

Any requests generated from the Payment SDK can then be directly made through this channel. The WebSocket channel gives additional capabilities to listen to the push messages from the server and helps in building a true reactive behavior on the client-side.

Conclusion

In this article,

  1. We have discussed the challenges in authenticating different SDKs in the banking domain.
  2. We discussed the Apollo App Center as a marketplace for the SDK publishers and consumers.
  3. We also discussed the solution for authenticating SDKs and its client implementation for mobile platforms.

Thank You

Written by: Swapnil Gupta, Apurva Jaiswal

Fund Collection Service

The fund collection service was built out to solve a problem in Rubix, the Integrated Business Travel and Expense Management product developed at Zeta.

The problem can be described as follows:

We needed to issue funds into a company's fusion** funding account. To do this, the company has to deposit the respective amount into an accessible, physical/reachable bank account of Zeta. After receiving the funds, Zeta, on behalf of the company, issues the funds into the company's fusion** funding account.

If we want to generalize this problem, we can describe it like this: a person wants to transfer funds from any type of source bank account to any type of destination bank account.

To generalize the problem, consider the following use cases, which will help us build the core model of the problem.

  1. The person doesn't have the required currency to credit the destination account; let's say the source account is a USD account and the destination account is an INR account. It is also possible that the destination account is a closed-loop reward-points account.
  2. The destination account is not accessible to the customer, for example, when the destination account is a virtual account or an NRE (Non-Resident External) account. In these cases a direct transfer of funds is not possible.
  3. The person wants to do several small transactions and each transaction itself accrues some cost; in certain cases doing a bulk transaction could be less expensive. Thus a service provider could facilitate these transactions while incurring fewer expenses.

All the above use cases can be fulfilled by an external service provider.

We can have 3 types of accounts that can facilitate these transfers.

  • Destination/Funding Account — The account to which the customer eventually intends to transfer the funds.
  • Collection Account — The physical, reachable bank accounts owned by the service provider, which it uses to collect the funds from the customer.
  • Reserve Account — The account owned by the service provider that is used as the source for the movement of funds into the destination account. It's important to note here that the reserve account and the destination account have to be transactional.

Just follow the below diagram for the actual interaction of all the entities.
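
In place of the diagram, here is a purely hypothetical sketch (all class and method names are made up) of how the three account roles interact: the customer deposits into a collection account, and the service provider then credits the funding account from its reserve account, applying whatever conversion rules and fees the business process defines:

import java.math.BigDecimal;

// Hypothetical model of the account roles involved in a fund collection flow.
enum AccountRole { FUNDING, COLLECTION, RESERVE }

record Account(String id, AccountRole role, String currency) { }

public class FundCollectionSketch {

    // Called when the service provider is notified of a deposit into its collection account.
    static void onDepositReceived(Account collection, Account reserve, Account funding,
                                  BigDecimal depositedAmount) {
        // Value conversion rule: deposit X value, receive credit for Y value.
        BigDecimal creditAmount = convert(depositedAmount, collection.currency(), funding.currency());
        // The reserve account is the source of the credit into the funding account.
        transfer(reserve, funding, creditAmount);
    }

    static BigDecimal convert(BigDecimal amount, String fromCurrency, String toCurrency) {
        return amount; // identity conversion; a real rule could also deduct fees here
    }

    static void transfer(Account from, Account to, BigDecimal amount) {
        System.out.printf("Transfer %s from %s to %s%n", amount, from.id(), to.id());
    }

    public static void main(String[] args) {
        Account collection = new Account("zeta-collection-1", AccountRole.COLLECTION, "INR");
        Account reserve = new Account("zeta-reserve-1", AccountRole.RESERVE, "INR");
        Account funding = new Account("company-funding-1", AccountRole.FUNDING, "INR");
        onDepositReceived(collection, reserve, funding, new BigDecimal("100000.00"));
    }
}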

Apart from the core model, there are some application-specific business processes. The fund collection service will require an extension to service these processes. A brief overview of these processes:

  1. Mechanisms provided to deposit funds into Collection Account.
  2. Mechanisms provided to notify the Service Provider about deposits into Collection Account and request a credit to the Funding Account.
  3. The Reserve Account to be used for each request from the customer.
  4. Business constructs for fulfilling the requests:
    • Value conversion rules: deposit X value and receive credit for Y value. (The currency of X and Y could be different.)
    • Fees and charges for the service; these could be expressed in terms of the destination account currency, the source account currency, or both. (There could be multiple parameters for assessing this; we could keep this out of scope.)
  5. Mechanisms provided to publish the relevant Collection Accounts to a customer. Only a subset of the Collection Accounts owned by the service provider may be relevant for the purpose of the customer.

Currently, the service has been extended to handle the fusion** transfers.

** Fusion: Fusion is a platform by Zeta to help all stakeholders. It is a BaaS (banking-as-a-service) platform for fintech developers, used for managing accounts, issuing physical, digital, or tokenized cards, controlling spends on channels, levying fees/charges/interest, etc. In its simplest form, Fusion provides you with a set of APIs that can help you build and solve for the fintech use case you are going after, thereby reducing your prototyping cost, iterations to a minimum viable product, and time to market for your go-to-market product.

Thank You

Speaker: Priya Panthi

Edited by: Phani Marupaka