A matter of time: Ensuring precise time and synchronization for critical infrastructure

Critical infrastructure services such as telecommunications, utilities, transportation and defense are of national strategic importance. The U.S. Cybersecurity and Infrastructure Security Agency (CISA) lists 16 such sectors considered vital for security. Presidential Policy Directive 21 (PPD-21): Critical Infrastructure Security and Resilience advances a national policy to strengthen and maintain secure, functioning and resilient critical infrastructure.

Together, positioning, navigation and timing (PNT) are necessary for the functioning of a nation’s critical infrastructure. However, ubiquitous use of GPS as the primary source of PNT information introduces vulnerabilities. CISA, through the National Risk Management Center, works with government and industry partners alike to strengthen the security and resiliency of the national PNT ecosystem in the U.S. In early 2020, Executive Order (E.O.) 13905 on Strengthening National Resilience through Responsible Use of Positioning, Navigation, and Timing (PNT) Services was signed to strengthen, through policy promotion, the responsible use of PNT services by government and infrastructure operators.

The following is a review of cost considerations and exploration of the three key elements for critical infrastructure that help to strengthen PNT, focused on synchronization and precise timing: redundancy, resiliency and security.

Evaluating Cost and Location

It is often hard for operators to justify the resiliency, redundancy and security costs associated with deploying these capabilities at every layer of the architecture. New timing and synchronization solutions and design choices are leading to the right cost structures to deliver robust and reliable solutions.

The dilemma between cost and solution type is typically related to which deployment location is considered. With the evolution of technologies such as the migration from SDH/TDM to Ethernet and the development of LTE/4G and 5G in mobile, the number of aggregation offices and, above all, of network access sites at the edge has exploded. This inevitably leads to much smaller devices, typically 1U rack-mountable units, with a cost in line with the much smaller size of edge base stations (small cells and gNodeBs).

Operators are left with the question: What is the best way to provide redundancy, resiliency and security in this environment? There are two core levels to consider — the architecture level and design level.

Exploring Redundancy

Redundancy at the architecture level can be engineered by placing core functions at both ends of a deployment (east/west), with dual paths for directional redundancy and high-performance capabilities for efficient, high-accuracy time transfer over the long haul. The virtual Primary Reference Time Clock (vPRTC) architecture is one such architecture-level solution.

Redundancy can also be considered in the device itself, where the design choices are critical. Smaller devices cannot realistically be designed cost-effectively with modular hardware redundancy. The innovation here is to offer software redundancy, so that a low-cost, efficient and high-performance distributed solution can be deployed. A redundant hardware module is typically expensive for two reasons: the cost of the module itself, and the fact that it takes the space of another module, typically one providing input and output ports.

Hardware module redundancy often leads to a tradeoff between adding redundancy and losing capabilities, such as giving up 10-gigabit Ethernet (10GE) support or multi-band GNSS when redundancy is enabled. With software redundancy, no such tradeoff is necessary. Redundancy can be introduced while preserving all existing capabilities: no inputs or outputs are eliminated, and no multi-band GNSS capability is lost. Because redundancy is introduced via a software upgrade, it does not remove any hardware. Hardware redundancy, by contrast, means duplicating an existing module with a similar module inside the device; the new module takes the slot of another module, and the function of that displaced module is lost when it is removed from the unit.

Figure 1 depicts a commonly deployed redundancy use case with two aggregation routers using Virtual Router Redundancy Protocol (VRRP).

Figure 1. Example of redundancy connectivity between the active and standby units. (Image: Microchip)

Software redundancy is a dual-unit scheme based on two reasonably priced devices, one active and the other on standby. It is more cost-effective for two reasons. First, it does not involve a costly device design with expensive hardware modules. Second, each unit (active and standby) keeps all of its capabilities. A hardware-redundant design, by comparison, duplicates modules inside the device and so gives up existing capabilities to host the redundant module.

Software redundancy provides total redundancy of the whole device because the active and standby units are the same. One hundred percent of the capabilities are redundant, including oscillator, GNSS receiver, ports and input/outputs. A hardware module is only redundant for its own features, not the rest of the unit.
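
The active/standby behavior described above can be sketched in a few lines of Python. This is only an illustration of a generic heartbeat-based switchover, not the scheme used by any particular product; the class name, unit names and missed-heartbeat limit are all hypothetical.

```python
class RedundantPair:
    """Toy model of two identical units, one active and one standby.

    Hypothetical sketch: the standby takes over after a configurable
    number of consecutive missed heartbeats from the active unit.
    """

    def __init__(self, missed_limit=3):
        self.active = "unit_a"      # hypothetical unit names
        self.standby = "unit_b"
        self.missed = 0             # consecutive missed heartbeats
        self.missed_limit = missed_limit

    def on_heartbeat_interval(self, heartbeat_received):
        """Call once per heartbeat interval; returns the active unit."""
        if heartbeat_received:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.missed_limit:
                # Swap roles: the whole standby unit (oscillator, GNSS
                # receiver, ports) takes over, not just one module.
                self.active, self.standby = self.standby, self.active
                self.missed = 0
        return self.active
```

Because the two units are identical, the switchover in this sketch moves every function at once, which is the point the text makes about whole-device redundancy.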

Leveraging Resiliency

Resiliency at the architecture level depends on engineering the network so that grandmasters in the deployment can be connected to each other. Some grandmasters are connected to GNSS as their source of time and frequency. It is key to connect these systems to other IEEE 1588 grandmasters to enable assisted partial timing support (APTS) and to leverage key innovations such as automatic asymmetry correction (AAC).

AAC is a key (patented) differentiator in a resilient design that enables calibration of the different paths a PTP flow may use to/from upstream grandmasters, thus allowing for a backup in case GNSS fails at the location of a grandmaster. A backup path to an upstream grandmaster can guarantee uninterrupted and precise time and phase operation. This architecture makes sure that GNSS can be backed up by IEEE 1588 Precision Time Protocol (PTP) when GNSS is interrupted, with the best path being utilized.
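
To see why path asymmetry matters for a PTP backup, the following minimal Python sketch applies the standard IEEE 1588 two-way exchange arithmetic. It is not the patented AAC algorithm, whose details are not given here; the function names and the example delays are hypothetical and chosen only for illustration.

```python
def simulate_exchange(true_offset_ns, delay_ms_ns, delay_sm_ns,
                      t1_ns=1_000_000, turnaround_ns=5_000):
    """Generate the four PTP timestamps for one exchange (hypothetical helper).

    true_offset_ns: client clock minus grandmaster clock
    delay_ms_ns:    grandmaster-to-client path delay
    delay_sm_ns:    client-to-grandmaster path delay
    """
    t2 = t1_ns + delay_ms_ns + true_offset_ns   # Sync received (client time)
    t3 = t2 + turnaround_ns                     # Delay_Req sent (client time)
    t4 = t3 - true_offset_ns + delay_sm_ns      # Delay_Req received (master time)
    return t1_ns, t2, t3, t4


def ptp_offset(t1, t2, t3, t4, asymmetry_correction_ns=0.0):
    """Client offset from the grandmaster, assuming a symmetric path.

    Any difference between the two path delays shows up as an error of
    (delay_ms - delay_sm) / 2, which a calibrated correction can remove.
    """
    raw = ((t2 - t1) - (t4 - t3)) / 2.0
    return raw - asymmetry_correction_ns


# Example: a perfectly aligned client on a path that is 400 ns slower
# in the reverse direction (values are illustrative only).
stamps = simulate_exchange(true_offset_ns=0, delay_ms_ns=5_000, delay_sm_ns=5_400)
uncorrected = ptp_offset(*stamps)
corrected = ptp_offset(*stamps, asymmetry_correction_ns=(5_000 - 5_400) / 2.0)
print(uncorrected, corrected)
```

In this sketch the uncorrected result carries a time error equal to half the delay difference even though the client is perfectly aligned. Calibrating each candidate path while GNSS is still available is what allows a PTP backup to remain accurate after GNSS is lost.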

The alternative architecture choice is virtual PRTC (vPRTC), which enables operators to leverage redundancy and resiliency via a chain of high-performance boundary clocks using PTP over long distances for high accuracy, typically over optical networks. This architecture reduces reliance on GNSS and uses PTP as its primary source of time and phase.

Figure 2 depicts an optical network deployment with a dedicated optical timing channel (OTC) for high-accuracy distribution of phase over long distances.

Figure 2. Optical network deployment with OTC. (Image: Microchip)

Resiliency at the device level starts with the right choice of oscillator, from an oven-controlled crystal oscillator (OCXO) to a rubidium atomic clock, and depends on the location, use case and respective requirements for timekeeping holdover performance. The choice of GNSS receiver is also key. Some receivers support only a single frequency band, yet ionospheric phenomena can create significant time delays during cyclical events such as solar storms. To mitigate such delays, a multi-band GNSS receiver is required.
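
The link between oscillator choice and holdover can be made concrete with the usual quadratic time-error model, x(t) = x0 + y0*t + 0.5*a*t^2, where y0 is the fractional frequency offset at the start of holdover and a is the linear aging rate. The Python sketch below is a simplified illustration that ignores temperature effects and noise; the parameter values in the example are hypothetical and are not datasheet figures for any OCXO or rubidium clock.

```python
import math


def holdover_time_error(t_s, x0_s=0.0, y0=0.0, aging_per_s=0.0):
    """Accumulated time error (seconds) after t_s seconds of holdover."""
    return x0_s + y0 * t_s + 0.5 * aging_per_s * t_s ** 2


def seconds_until_limit(limit_s, y0, aging_per_s):
    """Time until the error reaches limit_s, starting from zero error.

    Assumes y0 >= 0 and aging_per_s >= 0 (worst-case, same-sign drift).
    """
    if aging_per_s == 0:
        return math.inf if y0 == 0 else limit_s / y0
    # Positive root of 0.5*a*t^2 + y0*t - limit = 0
    disc = y0 ** 2 + 2.0 * aging_per_s * limit_s
    return (-y0 + math.sqrt(disc)) / aging_per_s


if __name__ == "__main__":
    limit = 1.5e-6  # assumed phase budget of 1.5 microseconds
    # Hypothetical oscillators: (label, initial offset, aging per day)
    for label, y0, aging_per_day in [("oscillator A", 1e-10, 5e-10),
                                     ("oscillator B", 1e-11, 5e-12)]:
        t = seconds_until_limit(limit, y0, aging_per_day / 86_400)
        print(f"{label}: about {t / 3600:.1f} hours within budget")
```

Running the model with the actual specifications of candidate oscillators, and with the holdover budget of the application in question, gives a first-order view of which class of oscillator a given site needs.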

Figure 3 compares single-band and multi-band time delays due to ionospheric effects and shows how multi-band reception mitigates the time error highlighted in red.

Figure 3. Comparative ionosphere phenomenon. Source: https://www.gsc-Europa.eu/system/files/galileo_documents/Galileo-OS-SDD.pdf. (Image: Microchip)

GNSS satellites transmit time information in several frequency bands. The delay difference between signals at different frequencies provides information about ionospheric impact on the absolute delay. This enables multi-band GNSS receivers to compensate for delay variations of radio signals transmitted from the satellite to the receiver. Embedding a multi-band receiver mitigates these time delays, which is critical for applications requiring Primary Reference Time Clock class B (PRTC-B) accuracy of 40 ns, as well as enhanced PRTC (ePRTC) accuracy of 30 ns.
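
As a worked illustration of that compensation, the Python sketch below uses the textbook first-order model, in which ionospheric delay scales with the inverse square of the carrier frequency. It is a simplification of what a real receiver does (no noise, no higher-order terms), and the function names and the synthetic measurements are hypothetical. The GPS L1 and L5 carrier frequencies are used as the example band pair.

```python
C_M_PER_S = 299_792_458.0   # speed of light
F_L1_HZ = 1575.42e6         # GPS L1 carrier frequency
F_L5_HZ = 1176.45e6         # GPS L5 carrier frequency


def iono_delay_on_f1_m(p1_m, p2_m, f1=F_L1_HZ, f2=F_L5_HZ):
    """First-order ionospheric delay on the f1 signal, in meters.

    p1_m, p2_m: pseudoranges measured on f1 and f2 to the same satellite.
    """
    return (f2 ** 2 / (f1 ** 2 - f2 ** 2)) * (p2_m - p1_m)


def iono_free_range_m(p1_m, p2_m, f1=F_L1_HZ, f2=F_L5_HZ):
    """Ionosphere-free combination: the first-order delay cancels out."""
    return (f1 ** 2 * p1_m - f2 ** 2 * p2_m) / (f1 ** 2 - f2 ** 2)


if __name__ == "__main__":
    # Synthetic example: a true range plus an assumed 6 m delay on L1.
    true_range_m = 20_200_000.0
    delay_l1_m = 6.0
    p1 = true_range_m + delay_l1_m
    p2 = true_range_m + delay_l1_m * (F_L1_HZ / F_L5_HZ) ** 2

    est_m = iono_delay_on_f1_m(p1, p2)
    print(f"estimated L1 delay: {est_m:.3f} m "
          f"({est_m / C_M_PER_S * 1e9:.1f} ns)")
    print(f"range error after correction: "
          f"{iono_free_range_m(p1, p2) - true_range_m:.6f} m")
```

A single-band receiver has only one of the two measurements and must rely on a broadcast ionospheric model instead, which is why the residual time error is larger during disturbed ionospheric conditions.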

These device design choices are equally important. The GNSS receiver can be embedded inside the unit on the main board, or it can be offered as a hardware module, often at additional cost, which may displace an existing module that then has to be ripped and replaced. It may be preferable to have the unit equipped with a multi-band receiver and have the multi-band capability turned on via a license, as opposed to offering a multi-band option on a hardware module, because the latter becomes a tradeoff with other important capabilities.

Evaluating Security

Security is of utmost importance. Authentication and authorization via standard mechanisms such as Terminal Access Controller Access-Control System Plus (TACACS+) and Remote Authentication Dial-In User Service (RADIUS) provide the benefit of a standard security framework. In addition, two-factor authentication (2FA) is an extra layer of protection used to ensure the security of accounts beyond just a username and password.

Also, it is key to provide Secure Shell (SSH) extensions with various levels of security profiles to offer more granularity for the types of users and related access rights and limitations. Offering high-security profiles provides for the definition and enforcement of the most stringent access rules to the system. Scripting vulnerabilities and relevant Common Vulnerabilities and Exposures (CVE) need to be addressed to make sure all potential security holes are being reviewed and addressed.

In addition, evolving jamming and spoofing threats need to be part of the precise time security strategy, implemented through signal monitoring, consistency checks and remediation. Automatic gain control (AGC) and other metrics can be compared against thresholds to interpret results and to trigger mitigation actions when anomalies are encountered.
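
A threshold-based check of this kind might look like the following Python sketch. It is a hypothetical illustration, not a description of any product's detection logic: the data fields, the four-sigma AGC threshold and the 100 ns consistency limit are assumptions chosen for the example, and a deployed system would combine more metrics than these two.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class GnssSample:
    agc_level: float          # receiver-reported AGC, arbitrary units
    gnss_minus_ref_ns: float  # GNSS time minus an independent reference
                              # (for example, a PTP backup or local holdover)


def assess(baseline, sample, agc_sigma=4.0, max_offset_ns=100.0):
    """Return a list of alert names for one sample (hypothetical thresholds).

    baseline: samples collected under known-good conditions.
    """
    levels = [s.agc_level for s in baseline]
    mu = mean(levels)
    sigma = pstdev(levels) or 1e-9   # avoid a zero-width threshold

    alerts = []
    # A large AGC excursion suggests unexpected in-band power (possible jamming).
    if abs(sample.agc_level - mu) > agc_sigma * sigma:
        alerts.append("agc_anomaly")
    # GNSS time disagreeing with an independent source suggests possible spoofing.
    if abs(sample.gnss_minus_ref_ns) > max_offset_ns:
        alerts.append("time_inconsistency")
    return alerts
```

Mitigation in such a scheme could be as simple as disqualifying GNSS and falling back to the PTP backup path or to holdover until the alerts clear.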

Final Decision Making

To ensure continued performance, it is critical to make the right architecture choices. A thorough network engineering study should include the locations where grandmaster units need to be deployed and their performance and accuracy requirements. These steps will guide which types of precise time and synchronization devices need to be selected.

In addition, network planners and synchronization engineers should pay careful attention to design choices such as fanless devices versus devices that require a fan, modular hardware redundancy versus software redundancy, and the related advantages in terms of cost and tradeoffs — as well as similar choices regarding embedded or modular GNSS.
These choices can lead critical infrastructure operators to deploy redundancy, resiliency and security at all layers.

For architecture choices and solutions, visit vPRTC. White papers on this topic and others are also available. Additional information on devices and redundancy software schema is here.

Eric Colard is head of Emerging Products, Frequency & Time Systems at Microchip. He leads the product line management for Microchip’s TimeProvider 4100 and Integrated GNSS Master solutions for the telecom, utility and other industries.
