PCF stands for Pivotal Cloud Foundry. Cloud Foundry is an open source Platform as a Service technology. It was originally created at VMware and open-sourced. Pivotal (which is now once again part of VMware) has its own Cloud Foundry offering: it takes the original Cloud Foundry open source project and combines it with other tools and technologies into a more enhanced Platform as a Service offering. While it can run single stand-alone apps or monoliths, Cloud Foundry shines with Cloud Native Applications. These are applications which embrace the 12 Factor App principles in order to provide scalability, resilience and consistency. A software product created following these principles usually consists of a suite of microservices. These are developed, deployed and monitored by a number of small cross-functional teams embracing DevOps and Continuous Delivery philosophies in order to very frequently deliver value to end users and receive feedback. That feedback is used to improve not only the end product itself, but also the team processes and methodologies that go into developing, deploying and monitoring the software behind the product.
So what’s this 12 Factor App thing?
It’s a set of heuristics to guide product engineering teams towards developing applications (or microservices) that best take advantage of cloud tools and technologies to achieve resilience, scalability and consistency. A whistle-stop tour would be:
- Treat your logs as a stream of events.
- One codebase but multiple deploys for each microservice.
- Build and run processes should be separated.
- Any management or admin tasks should be run as one-off jobs.
- Don’t have any shared dependencies. Each microservice should explicitly declare and isolate its dependencies.
- If a microservice uses a backing service such as a DB, treat this as a resource.
- If you want to scale out a microservice, use the process model to scale concurrently and/or scale horizontally across containers/VMs.
- For a microservice to be useful, it usually needs to be exposed to other services or the outside world – use port binding to do this.
- Don’t use config files. Instead specify a microservice’s config in the environment that is given to the microservice (see the sketch after this list).
- Model each microservice as one or more stateless processes – so no sessions please.
- Your microservices should start up quickly and should shutdown gracefully.
- No surprises with the environment please – your dev, staging and prod environments should be as consistent as possible.
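To make the config factor a bit more concrete: on Cloud Foundry, a microservice’s config typically ends up in its environment rather than in a file checked in alongside the code. A minimal sketch, assuming a hypothetical app called my-service and a made-up variable name:

```
# set a config value in the app's environment (app and variable names are hypothetical)
cf set-env my-service GREETING_MESSAGE "hello from the environment"

# restage so the running instances pick up the new environment
cf restage my-service
```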
Generally "best practices" aren’t really "best practices". They are heuristics that a lot of teams have found to work well for them. So it’s best not to be too dogmatic about these but they do serve as a good guide.
So where does PCF run my apps?
This can be anywhere really – on premise, AWS, GCP etc. Cloud Foundry has an abstraction layer over infrastructure as a service (IaaS) providers. This is the main engine of Cloud Foundry and is called BOSH. It means that microservices deployed on Cloud Foundry can run on top of most IaaS providers, or on premise if you have the hardware resources available to run a Cloud Foundry installation at the scale you desire. Pivotal provides a service called Pivotal Web Services (PWS) at https://run.pivotal.io/. This is the quickest way to get started with PCF. In a previous blog post, I showed how to get up and running there with a Kotlin/Spring Boot microservice.
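Once you have a PWS account, pointing the CLI at it looks something like this (the org and space names are whatever you set up when registering):

```
# log in against the PWS API endpoint, targeting an org and space
cf login -a api.run.pivotal.io -o my-org -s development
```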
So is it similar to packaging and deploying apps with docker images and containers?
Yep, this is quite similar but not exactly the same. While Cloud Foundry does support running Docker containers, it can also package binaries and their dependencies into what is called a Droplet. This is a bit like a Docker image. This Droplet can then be instantiated inside a Cloud Foundry container called a Garden container. A container is a logical unit for a process and its dependencies. It runs on top of a VM which is shared by other containers, but its resources, memory, file system mounts and networking are completely isolated from other containers. Cloud Foundry leverages Linux kernel features to do this: cgroups to limit resources like memory and CPU, and namespaces to isolate things like process IDs and networking. Because a container doesn’t have to include a complete operating system and can use the same underlying kernel resources as other containers on a VM, it is very fast to spin up. A container is built in layers. Things that all containers need, like the base OS file system, are contained in a read-only layer that each container can share. Things that are specific to a container are contained in its read-write layer, which is not shared. It’s worth noting too that Cloud Foundry also supports Windows containers. A container is deployed to a Diego Cell using the cell’s Garden API. This API is an implementation of the Open Container Initiative specification, thus allowing Docker containers to be deployed to Diego Cells.
How are these garden containers orchestrated?
This is taken care of by Cloud Foundry’s Diego system. This is a bit like Kubernetes. It is a self-healing container management system. Containers run inside what are called Diego Cells. A Diego Cell is a VM – a bit like an EC2 instance in AWS (and could well be running on an EC2 instance if your Cloud Foundry installation/platform is using AWS as its IaaS provider). There is a component called the Diego Brain. This holds what are called auctions using its Auctioneer component. Diego Cells participate in these auctions so that apps/microservices/tasks can be evenly distributed among the cells. Cloud Foundry also contains what is called a Bulletin Board System (BBS). This keeps track of the actual number of instances of each microservice and the desired number of instances, so that discrepancies can be rectified by initiating a Diego Brain auction process.
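In day-to-day use you don’t talk to Diego directly – you declare the desired state and let the BBS and the auction machinery converge on it. For example (app name hypothetical):

```
# ask for three instances - the BBS records the desired count and
# Diego's auctions place any extra containers on suitable cells
cf scale my-service -i 3

# compare actual vs requested instances
cf app my-service
```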
So how do I get my microservice onto PCF?
As shown in a previous blog post, Cloud Foundry can be used through a command line interface. So to deploy a JVM based application, it’s as simple as running
cf push
That’s a bit too much magic for me. What actually happens when I do a cf push?
CLI commands such as this are sent as requests to the Cloud Controller component in Cloud Foundry (also called "CAPI"). The CAPI sends requests to Diego through the Bulletin Board System. The CAPI component also contains a blobstore. This stores all sorts of things such as metadata about applications (name, desired instance count etc.), application packages, droplets, buildpacks and a buildpack cache (I’ll get to buildpacks in a sec). When the cf push
command sends its request to the CAPI, metadata about the application being pushed is uploaded. This is contained in what is called a manifest.yml file. It contains details such as app name, number of instances, memory, disk quota etc. Following this, an "Uploading" stage is initiated, followed by a "Staging" stage, which is followed by the "Running" stage.
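As an aside, a minimal manifest.yml for a JVM microservice might look something like this (the name, path and quotas are purely illustrative):

```
applications:
- name: my-service
  instances: 2
  memory: 1G
  disk_quota: 1G
  path: build/libs/my-service-0.0.1.jar
```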
In the uploading stage, there is first a check to see if any of the application’s packages already exist in the blobstore, to save us from having to upload more than we need to. After this check, the CAPI initiates an upload of the required application packages from the developer’s machine. These are combined with existing application packages in the blobstore to form an overall application package.
After the uploading stage, the CAPI sends a request to Diego to schedule the "Staging" process. Diego schedules this to run on a Diego Cell. Before going further, it is helpful to know a bit about buildpacks.
A buildpack is something that is run to put together all the runtime support and dependencies, such as framework dependencies, required to run an app/microservice. The result of this process is a Droplet, which is essentially an image that can be instantiated in a Diego container. The buildpack for a microservice can be specified in its manifest.yml file. If it is not specified, Cloud Foundry will go through a process to determine which buildpack should be used. It does this by cycling through potential buildpacks, which it downloads or gets from the CAPI buildpack cache, and running each buildpack’s "Detect" script to determine if it is suitable for the application being built.
For example, calling cf push on a JVM jar file will eventually trigger this staging process for the uploaded jar. The staging process will find the buildpack capable of packaging a JVM based application. This buildpack will package up the required JVM runtime and other required dependencies like DB drivers etc.
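You can also skip detection and pick the buildpack explicitly when pushing – something along these lines (the jar path and buildpack name depend on your platform):

```
# push a jar and explicitly choose the Java buildpack rather than relying on detection
cf push my-service -p build/libs/my-service-0.0.1.jar -b java_buildpack
```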
After the Droplet is created by the buildpack, it is stored in the CAPI blobstore. The CAPI is notified that the Droplet is available, so it then instructs Diego to run the application by instantiating the Droplet in a container.
How do requests get into and responses come out of my microservice on PCF?
This happens through the Cloud Foundry component called the "Go Router". It routes incoming requests to applications running in containers on Diego Cells, or to Cloud Foundry components themselves. When a microservice is deployed to PCF, its route can be specified in its associated manifest.yml file, or PCF can assign a random route. Further routes can also be created later and mapped to or unmapped from applications. This enables blue-green deployments, where a temporary route is mapped to a new version of a microservice; when we are happy that the new version is behaving and performing correctly, we map the real route to it.
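The final switch-over in a blue-green deployment looks roughly like this with the CLI (the app names, domain and hostname here are hypothetical):

```
# start sending traffic on the real route to the new version
cf map-route my-service-v2 example.com --hostname my-service

# once happy, stop routing the real route to the old version
cf unmap-route my-service-v1 example.com --hostname my-service
```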
But can I restrict access to my services and restrict data coming from them?
Similar to AWS security groups and network access control lists, application security groups can be set up to restrict the ports, IP address ranges and protocols that can be used to send data from a microservice to the outside world. Cloud Foundry also has a User Account and Authentication service (UAA), which is an OAuth token issuer that allows microservices and other components to talk to each other securely. Also, PCF organizes user roles, resources and application deployments into scoped logical units called Orgs and Spaces.
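To make the security group part concrete, creating and binding an application security group from the CLI looks something like this (the group name, rules file and CIDR range below are made up):

```
# asg-rules.json - allow outbound MySQL traffic to a single subnet only:
# [{ "protocol": "tcp", "destination": "10.0.11.0/24", "ports": "3306" }]

# create the group from the rules file and bind it to an org/space
cf create-security-group my-service-asg asg-rules.json
cf bind-security-group my-service-asg my-org my-space
```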
What are Orgs and Spaces all about?
These are how Cloud Foundry scopes things like user roles, application deployments, services and resources. There are two levels of scoping: the org level and the space level. You can choose how you want to model your orgs and spaces. For example, your orgs could correspond to different product offerings and each space inside an org could correspond to an environment such as dev, QA, prod etc. When logging into PCF using the CLI, you can specify which org and space combination you are targeting, and you can change at any point to target a different org-space combination (see the sketch after the list of roles below). Users are created and scoped to orgs and spaces and have different roles. A whistle-stop tour of these roles would be:
- Org Manager. This person can add users and their permissions at the org level; she/he can also add private domains and add or remove spaces among other capabilities.
- Org Billing Manager and Org Auditors. These roles give read-only access to view things like application statuses and org quotas.
- Space Manager. This person can add users and permissions at the space level among other capabilities.
- Space Developer. This person can do things like deploy applications, create services, bind services to applications and bind routes to applications etc.
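Switching the CLI between org-space combinations is a one-liner (the org and space names here are hypothetical):

```
# retarget the CLI at a different org/space combination at any point
cf target -o my-product-org -s qa
```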
You mention services there, what are they?
Cloud Foundry allows services to be created from service brokers. A service broker is a factory from which a service instance is created, e.g. a MySQL DB instance. There are different service brokers for all sorts of services, e.g. MongoDB, MySQL, Redis etc. These are available on the Cloud Foundry Marketplace. Each service broker offers different plans – for example, some have a free plan with restrictions. It is also possible to create your own service broker. Service instances can easily be mapped to microservices on PCF. It’s as simple as running the cf bind-service
command. A service instance is scoped by an org-space combination.
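End to end, provisioning and binding a MySQL-style service instance looks something like this (the broker and plan names vary by marketplace, and the app and instance names are hypothetical):

```
# see which service brokers and plans are available
cf marketplace

# create a service instance from a broker and plan
cf create-service cleardb spark my-mysql

# bind it to the app and restage so the binding is picked up
cf bind-service my-service my-mysql
cf restage my-service
```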
How can I see what’s going on with my microservices on PCF?
Each Diego Cell has what is called a Cell Rep. This component streams output from stdout and stderr, along with metrics, to another Cloud Foundry component called the Metron Agent. This then passes the log event stream and metrics on to the Loggregator component. Logs and metrics can be retrieved from Loggregator through the CLI command cf logs
but, more powerfully, Loggregator aggregates these and exposes them through its "Firehose". Third party applications can plug into the Firehose using what are called "Nozzles". This is how very powerful monitoring tools such as Datadog can plug into the log/metric event stream coming from your microservices deployed on PCF.
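For day-to-day debugging, the CLI commands are straightforward (app name hypothetical):

```
# stream the live log output for an app
cf logs my-service

# or just dump the most recent buffered log lines
cf logs my-service --recent
```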
Conclusion
I hope this has given an overview of what PCF is all about. There is a lot more to PCF. When used with a powerful ecosystem such as Spring Cloud in the JVM space or .NET Core in the .NET space, it can greatly simplify the delivery and monitoring of microservices that follow the 12 Factor App heuristics.
Thanks for this great post on PCF Tom, it’s a great reference article and very easy to follow for non-engineers like myself.
Do you know if PCF is highly adopted by tech companies across different sectors?
Is it aimed at Startups, SMB, or Enterprise?
Hi Macdara, thanks a million for your comment. I’ve no stats on this but, from what I can see, I don’t think it is as widely adopted as infrastructure as a service platforms like AWS. PCF is agnostic to the underlying infrastructure as a service platform it runs on but, from what I can see, there seem to be a lot more companies using AWS or other infrastructure as a service platforms like GCP directly. I don’t think PCF is aimed at particular company sizes – it’s very useful for companies and projects of all sizes.
Thanks again for the comment.
Tom