NTP – Network Time Protocol

In the ever-evolving landscape of technology, precision in timekeeping is the silent force that synchronizes the digital world. Behind the scenes of our daily digital interactions lies a network of intricate systems working tirelessly to ensure that every device, every transaction, and every communication is precisely timed. At the heart of this network is NTP – the Network Time Protocol.

Origins and Innovator

NTP was conceived in the early 1980s by Dr. David L. Mills, a visionary computer scientist and professor at the University of Delaware. His pioneering work in the field of computer networking laid the foundation for modern time synchronization protocols. Dr. Mills envisioned NTP as a solution to the challenges of accurately maintaining time across distributed networks. Dr. Mills passed away on January 17, 2024 at the age of 85.

Satellites and Precision

Satellites play a crucial role in NTP by providing a reliable and precise time reference. GPS satellites, with their atomic clocks and synchronized signals, serve as an indispensable source for accurate timekeeping. NTP receivers utilize these signals to synchronize their internal clocks, ensuring precise timekeeping even in remote locations. This enables users to determine the time to within 100 billionths of a second.

Implementation and Open Source

NTP’s design and implementation are open source, fostering collaboration and innovation within the community. Popular implementations like the classic NTP reference implementation and the newer Chrony offer robust features and optimizations for various use cases. Let’s delve into some code snippets to understand how NTP can be used in languages like C++ and Rust.

C++ Project on Github

https://github.com/plusangel/NTP-client/blob/master/src/ntp_client.cpp

Rust Project on Github

https://github.com/pendulum-project/ntpd-rs/blob/main/ntpd/src/ctl.rs

Device Integration and Stratums

Devices across the spectrum, from personal computers to critical infrastructure, rely on NTP for time synchronization. NTP organizes time sources into strata, where lower strata represent higher accuracy and reliability. Primary servers, directly synchronized to authoritative sources like atomic clocks, reside at the lowest stratum, providing precise time to secondary servers and devices.

NTP server stratum hierarchy

Image Credit: Linux Screenshots. License info

Comparison and Adoption

Compared to other time synchronization protocols like Precision Time Protocol (PTP) and Simple Network Time Protocol (SNTP), NTP stands out for its wide adoption, versatility, and robustness. While PTP offers nanosecond-level precision suitable for high-performance applications, NTP remains the go-to choice for general-purpose time synchronization due to its simplicity and compatibility.

Corporate Giants and NTP Servers

Large companies like Google, Microsoft, and Amazon operate their own NTP servers to ensure precise timekeeping across their global infrastructure. These servers, synchronized to authoritative time sources, serve as beacons of accuracy for millions of devices and services worldwide.

Time for Reflection: The Importance of NTP

Imagine a world without NTP – a world where digital transactions fail, communication breaks down, and critical systems falter due to desynchronized clocks. NTP’s absence would plunge us into chaos, highlighting its indispensable role in modern technology.

An interesting real-world scenario arises when NTP is absent or inaccurate at a higher-stratum clock. Imagine two machines, m1 and m2, exchanging information while their clocks are out of sync: m1 shows 10:05 am and m2 shows 10:00 am. Now m1 sends some data to m2. If I compute the transfer time by subtracting m1's send timestamp from m2's receive timestamp, the result is a negative number!
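The negative "transfer time" above is exactly the trap NTP's symmetric timestamp exchange avoids. Here is a sketch in Python of the standard four-timestamp offset/delay calculation (as in RFC 5905), applied to the m1/m2 scenario; all timestamp values are hypothetical.

```python
# Sketch of NTP's four-timestamp offset/delay calculation (per RFC 5905),
# applied to the m1/m2 scenario above. Timestamps are hypothetical seconds
# since midnight; m1's clock runs 5 minutes (300 s) ahead of m2's.

def ntp_offset_delay(t1, t2, t3, t4):
    """t1: client send, t2: server receive, t3: server send, t4: client receive.
    Returns (clock offset, round-trip delay) as estimated by the client."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

t1 = 36300.0   # m1 sends at 10:05:00 (m1's clock)
t2 = 36000.1   # m2 receives at 10:00:00.1 (m2's clock); the naive one-way
               # "latency" t2 - t1 = -299.9 s is the negative number above
t3 = 36000.2   # m2 replies 0.1 s later
t4 = 36300.3   # reply arrives back at m1

offset, delay = ntp_offset_delay(t1, t2, t3, t4)
print(round(offset, 3))  # -300.0 -> m1 should step its clock back 5 minutes
print(round(delay, 3))   # 0.2 -> the true round trip, with the skew cancelled out
```

Because the skew appears with opposite signs in the two one-way legs, it cancels in the delay and averages out in the offset, which is why NTP can synchronize clocks without either side knowing the "true" time in advance.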

In conclusion, NTP stands as a testament to human ingenuity, enabling seamless synchronization across the digital realm. From its humble origins to its ubiquitous presence in our daily lives, NTP continues to shape the interconnected world we inhabit. So, the next time you glance at your device’s clock, remember the silent guardian working tirelessly behind the scenes – the Network Time Protocol.

NTP stratum 1 servers in the form of robots getting time data from satellites


My car has a digital twin!

We have all heard about people having a twin. But what if I told you that my car has a digital twin! Let us understand Embodied AI in autonomous driving and a car's digital twin.

Embodied AI is at the forefront of transforming the landscape of autonomous and self-driving cars, paving the way for safer roads and enhanced transportation systems. But what exactly is Embodied AI, and how does it revolutionize the realm of autonomous driving?

Embodied AI refers to the integration of artificial intelligence within physical systems, enabling them to perceive, interpret, and interact with the surrounding environment in real-time. In the context of autonomous vehicles, it entails equipping cars with sophisticated sensors, actuators, and intelligent algorithms to navigate roads autonomously while ensuring safety and efficiency.

Now, let’s delve into the fascinating realm of digitization or digital twin technology and its pivotal role in advancing autonomous driving:

🔍 Digitization and Digital Twin of Cars:

Digitization involves creating a virtual representation of physical objects or systems. In the case of cars, this entails developing a digital twin—a highly detailed, dynamic model that mirrors the behavior, characteristics, and functionality of its real-world counterpart. By continuously syncing data between the physical vehicle and its digital twin, automakers and AI engineers can:

Image Credit : NXP

  1. Enhance Training and Testing: Digital twins serve as invaluable tools for training AI algorithms and conducting extensive simulations in a safe, controlled environment. This enables developers to expose autonomous systems to a myriad of complex scenarios, including rare edge cases and adverse weather conditions, which are crucial for refining their decision-making capabilities.
  2. Iterative Development: Through iterative refinement and optimization, digital twins facilitate the rapid prototyping and iteration of autonomous driving systems. Engineers can simulate various design modifications and algorithmic enhancements, accelerating the development cycle and reducing time-to-market.
  3. Predictive Maintenance: By leveraging real-time sensor data and predictive analytics, digital twins enable proactive maintenance and diagnostics, thereby minimizing downtime and optimizing the operational efficiency of autonomous fleets.
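As a toy illustration of the sync-and-predict loop in points 1-3, here is a minimal Python sketch of a car's digital twin that mirrors telemetry and raises a predictive-maintenance alert. All names and thresholds (VehicleTwin, brake_wear_pct, the 80% limit) are illustrative assumptions, not any vendor's actual API.

```python
# Minimal sketch of a digital twin: mirror telemetry frames from the
# physical car, and flag maintenance when a wear metric crosses a threshold.
# Class, field, and threshold names here are hypothetical.

class VehicleTwin:
    """Virtual mirror of one physical car, updated from telemetry."""

    BRAKE_WEAR_LIMIT = 80.0  # % wear at which we schedule service (assumed)

    def __init__(self, vin):
        self.vin = vin
        self.state = {}    # latest mirrored sensor readings
        self.alerts = []

    def sync(self, telemetry):
        """Ingest one telemetry frame from the physical vehicle."""
        self.state.update(telemetry)
        self._check_maintenance()

    def _check_maintenance(self):
        wear = self.state.get("brake_wear_pct", 0.0)
        if wear >= self.BRAKE_WEAR_LIMIT:
            self.alerts.append(f"{self.vin}: brake service due ({wear:.0f}% worn)")

twin = VehicleTwin(vin="TESTVIN0001")
twin.sync({"speed_kph": 62.0, "brake_wear_pct": 45.0})
twin.sync({"brake_wear_pct": 83.5})  # a later frame crosses the threshold
print(twin.alerts[0])                # TESTVIN0001: brake service due (84% worn)
```

In a real system the `sync` calls would be fed by the vehicle's telemetry stream, and the same twin object could also be driven by simulated frames for the off-road testing described below.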

🛣️ Predicting and Comparing On-Road Performance through Off-Road Simulation:

One of the greatest challenges in autonomous driving lies in accurately predicting and comparing on-road performance across different driving conditions. Here’s how off-road simulation powered by digital twins addresses this challenge:

  1. Scenario Generation: Off-road simulation platforms leverage digital twins to generate diverse and realistic driving scenarios, encompassing a wide spectrum of environmental factors, traffic conditions, and pedestrian behaviors. By meticulously crafting these scenarios, developers can assess the robustness and adaptability of autonomous systems under various challenging conditions.
  2. Performance Benchmarking: Through off-road simulation, developers can systematically benchmark the performance of different autonomous driving algorithms and sensor configurations. By quantitatively evaluating metrics such as safety, efficiency, and comfort across diverse scenarios, stakeholders can make informed decisions regarding technology integration and deployment strategies.
  3. Continuous Learning and Improvement: Off-road simulation serves as a continuous learning loop, wherein insights gleaned from simulated scenarios inform the iterative refinement of AI algorithms and sensor fusion techniques. By iteratively exposing autonomous systems to increasingly complex and diverse challenges, developers can enhance their resilience and reliability over time.

In conclusion, Embodied AI, coupled with digitization and off-road simulation, heralds a new era of innovation in autonomous driving, promising safer roads, enhanced mobility, and unprecedented levels of efficiency. As we continue to push the boundaries of technological advancement, let us harness the power of AI to shape a future where transportation is not just autonomous but truly intelligent. 🌐🚀 #AutonomousDriving #EmbodiedAI #DigitalTwin #Innovation #FutureofMobility

Traffic simulation

Image Credit : Zhou, Zewei et al. “A comprehensive study of speed prediction in transportation system: From vehicle to traffic.” iScience 25 (2022): n. pag.

License info about article containing above image : https://creativecommons.org/licenses/by/4.0/

How to (or not to) hire

Across multiple years of experience leading teams, I have made enough mistakes to learn how to (or not to) hire. I remember, a few years ago during a downturn (I have experienced many of them in the Bay Area), trying to hire a manager for one of the engineering teams. There is no scarcity of really good, accomplished people seeking opportunities, especially during a downturn. At times their experience might seem to match your requirement perfectly. I put out a job req and started getting so many resumes that I could not keep up. There were a lot of good resumes from candidates with great experience; although not a perfect fit, it seemed they could more than easily accomplish the task.

My manager asked me to shortlist 10 resumes and start talking with these people. I was probably too naive – I requested a couple more weeks from my manager, hoping a more suitable resume would come through. My manager grudgingly agreed, cautioning me not to extend the process too long as the current lot of resumes was already quite good. One week passed and that perfect resume never came. I frantically contacted people in my network and passed on the job req. I also reached out to two ex-colleagues I thought were a good fit for the job. It turned out one of them was interested in talking more, as the person was looking for a change.

The person agreed to come for an interview after two weeks (I guess s/he was preparing for the interviews!). This candidate performed quite well and everybody gave good reviews. We extended an offer, and we were countered. HR was blindsided and did not expect such a big difference in the counter offer. Some of the information this candidate requested was really smart, eye-opening and legitimate. The candidate was doing his/her job vetting the company, its culture, its position in the market, its long-term roadmap and much more. In the end the candidate did not accept our offer and joined a different company.

We were back to square one, and I was back looking at old resumes. Most of those candidates had moved on, and we had to restart the whole process. My manager was not happy; however, he asked me to take this setback in stride. This is when he explained to me how to (or not to) hire. Here is my best recollection of that discussion.

  • Interviewing only a single candidate for a job lowers your bar. A minimum of 4 to 5 candidates (some companies set a higher number) should go through the full loop. This helps the company understand the current candidate pool in the industry and get the best of the available lot. It also gives most candidates the fair chance they very well deserve.
  • Some of the best candidates are not just interviewing at the company; they are also interviewing the company. It is therefore not guaranteed that the person will join if an offer is extended.
  • There is no such thing as a perfect candidate. The more you delay searching for a perfect candidate, or wait for somebody in your network to become available, the more time is wasted, and some of the good candidates from the current pool may no longer be available by the time you decide. You may then have to settle for a less fitting candidate. A late hire thus affects the company's bottom line.
  • A lot of hiring happens through references and knowing people and ex-colleagues in the industry. Both candidates and companies should take advantage of these connections. Referred candidates should go through the same loop as everyone else.

And lastly, having been on both sides of the interviewing desk, I do understand why companies are hesitant to provide feedback when a candidate is rejected. I do, however, think it would be a good thing to provide some form of feedback to the candidate. If anybody has succeeded in doing this the right way, please let us know in the comments.

Other interesting reads

IPv6 – NDP, SLAAC and static routing

Engineers working on network deployment, maintenance and debugging may feel caught in an endless journey transcending various realms. Fret not – IPv6 deployments are getting easier, and more help is coming! In the meantime, let's understand what NDP, SLAAC and static routing are in IPv6.

Solicited-node multicast address

  • Generated by taking the last 6 hex characters (24 bits) of the IPv6 address and appending them to ff02::1:ff
    • E.g. For the unicast address 2001:1bd9:0000:0002:1d2a:5adf:ae3a:1d0c, the solicited-node multicast address is ff02:0000:0000:0000:0000:0001:ff3a:1d0c (ff02::1:ff3a:1d0c)
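This mapping can be sketched in a few lines of Python using only the standard-library ipaddress module (the helper name solicited_node is my own; the sample address is a valid unicast address in the spirit of the example above):

```python
# Compute the solicited-node multicast address: ff02::1:ff + last 24 bits.
import ipaddress

def solicited_node(addr: str) -> str:
    """Map a unicast IPv6 address to its solicited-node multicast address."""
    last24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF  # last 6 hex digits
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))   # ff02::1:ffXX:XXXX
    return str(ipaddress.IPv6Address(base | last24))

print(solicited_node("2001:1bd9:0:2:1d2a:5adf:ae3a:1d0c"))  # ff02::1:ff3a:1d0c
```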

NDP

  • NDP (Neighbor Discovery Protocol) in IPv6 has various functions; one of them replaces ARP (Address Resolution Protocol), which IPv4 networks use to obtain a node's MAC address from its IP address.
    • NDP uses ICMPv6 and the solicited-node multicast address for this ARP-like function
    • Unlike ARP, NDP does not use broadcast
    • Two messages are used
      • NS (Neighbor solicitation) = ICMPv6 Type 135
      • NA (Neighbor Advertisement) = ICMPv6 Type 136
  • Instead of an ARP table, an IPv6 neighbor table is maintained
  • NDP also allows hosts to discover routers on local networks.
    • RS (Router Solicitation) = ICMPv6 Type 133
      • sent to address FF02::2 (routers multicast group)
      • Sent when interface is (re)enabled
    • RA (Router Advertisement) = ICMPv6 Type 134
      • Sent to address FF02::1 (all nodes multicast group) as reply to RS and periodically

SLAAC

  • SLAAC (Stateless Address Autoconfiguration) – one of the ways to configure an IPv6 address
    • Node uses RS/RA messages to learn the IPv6 local link prefix
    • Interface ID is then generated using EUI-64 or randomly

An Engineer working on solving a complex IPv6 networking problem about NDP, SLAAC and static routing with a tiger and starwars soldiers guarding

DAD

  • DAD (Duplicate Address Detection) – a function of NDP which a node uses, before an IPv6 address is configured on its interface, to check whether any other node already has the same IPv6 address
    • The host sends an NS targeting its own (tentative) IPv6 address. If there is a reply, another node already has that address, and the host cannot use it.

IPv6 static routing

  • Directly attached static route: only the exit interface is specified. Used on point-to-point links that do not need next-hop resolution; not allowed on multi-access networks like Ethernet
  • Recursive static route: Only next hop IPv6 address is specified
  • Fully specified static route: Both exit interface and next hop are specified.

Types of IPv6 addresses

IPv6 is gaining traction and starting to be deployed rapidly, fueled by advancements in AI/ML, AR/VR, financial markets and other technologies. Below is a list of the different types of IPv6 addresses and their uses.

  1. Global Unicast
    • Globally unique public IPv6 addresses usable over the internet.
    • Here are the Global Unicast IPv6 address assignments
  2. Unique local
    • Private IPv6 addresses not usable over the internet (ISP will drop packets)
  3. Link local
    • Automatically generated IPv6 address when the network interface is enabled with IPv6 support.
    • Address is taken from the FE80::/10 range (so it typically starts with FE80), and the interface ID is generated using EUI-64
    • Used for communication within a subnet like OSPF LSAs, next hop for static routes and NDP
  4. Anycast
    • Any global unicast or unique local IPv6 address can be designated as Anycast address
  5. Multicast
    • Address block FF00::/8 used for multicast in IPv6
    • Multicast address scopes
      • Interface-local
      • Link-local
      • Site-local
      • Organization-local
      • Global
  6. EUI64
    • EUI = Extended Unique Identifier. This method allows automatic generation of IPv6 address using MAC address
    • EUI-64 is a method of converting a 48bit MAC address into 64 bit interface identifier
      • Divide the MAC address at the midpoint — e.g. 1234 5678 90AB is divided into 123456 | 7890AB
      • Insert FFFE in middle — 1234 56FF FE78 90AB
      • Invert the 7th bit from the most significant side  — 1234 56FF FE78 90AB becomes 1034 56FF FE78 90AB
    • This 64 bit interface identifier is then used as host portion of a /64 IPv6 address by adding it on to the 64 bit network prefix making a 128bit IPv6 address
  7. :: (two colons)
    • Same as IPv4 0.0.0.0
  8. ::1 (loopback)
    • Same as the IPv4 127.0.0.0/8 address range; IPv6 uses only a single loopback address, unlike IPv4
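The EUI-64 steps above, and several of the address categories, can be sketched with the standard-library ipaddress module (the helper name eui64_interface_id is my own; the example MAC matches the one in the text):

```python
# EUI-64 conversion following the three steps above, plus quick category
# checks for a few of the IPv6 address types using the ipaddress module.
import ipaddress

def eui64_interface_id(mac: str) -> str:
    """Convert a 48-bit MAC (e.g. '12:34:56:78:90:AB') into the
    64-bit EUI-64 interface identifier, shown as four hextets."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[3:3] = b"\xff\xfe"   # 1) split at the midpoint, 2) insert FFFE
    octets[0] ^= 0x02           # 3) flip the 7th bit (universal/local bit)
    return ":".join(f"{octets[i] << 8 | octets[i + 1]:04x}" for i in range(0, 8, 2))

print(eui64_interface_id("12:34:56:78:90:AB"))  # 1034:56ff:fe78:90ab

# A few of the address categories above map directly to ipaddress properties:
print(ipaddress.IPv6Address("fe80::1").is_link_local)  # True  (link local)
print(ipaddress.IPv6Address("fd00::1").is_private)     # True  (unique local)
print(ipaddress.IPv6Address("ff02::1").is_multicast)   # True  (multicast)
print(ipaddress.IPv6Address("::1").is_loopback)        # True  (loopback)
```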

Types of IPv6 addresses

What is Null and Alternative Hypothesis


Null and alternative hypotheses are used extensively in Machine Learning. Before we answer what the null and alternative hypotheses are, let us understand what hypothesis testing is.

Hypothesis testing is used to assess whether the difference between samples taken from populations is representative of an actual difference between the populations themselves.

Now why do we even conduct hypothesis testing? Suppose we are comparing the efficacy of two different exercises on 10 patients who underwent the same kind and complexity of knee replacement surgery. This should not be too difficult to do with real data. E.g. 5 patients are asked to do exercise 1 and the other 5 are asked to do exercise 2, 15 mins a day for 1 month after surgery. After a month, they are tested for the angle to which they can bend their knee. This comparison between patients is not difficult to make.

Now let us imagine the same comparison between two groups of patients from two different hospitals. The comparison quickly becomes unwieldy and introduces multiple random factors which can easily affect the data. E.g. some patients exercise after a shower or after food, or some patients were on different medication which affected their musculoskeletal system. Now imagine the comparison is not across these two groups, but across the population of the whole state of California. It is extremely difficult, if not impossible, to compare every single patient in the state of California. Let's park this thought for a second.

Now what is Null Hypothesis (H0):

Null hypothesis H0 is also described as “no difference” hypothesis. i.e. There is no difference between sample sets from different populations. Here we mostly see an equality relationship.

So for the example above about samples of patients from the state of California, we start by assuming that the null hypothesis H0 is true, i.e. the samples from the different populations are the same and there is no difference between them. We then compute the p-value: a number between 0 and 1 giving the probability of observing data at least as extreme as ours if the null hypothesis were true. Generally a p-value less than 0.05 is considered low; the observed data would be very unlikely under the null hypothesis, so we reject it. For a p-value greater than 0.05, we fail to reject the null hypothesis (which is a deliberately cautious way of saying the data are consistent with it). When the null hypothesis is rejected (p < 0.05), an alternative hypothesis must be adopted in its place.

What is Alternative Hypothesis (Ha) :

The alternative hypothesis is a new claim made against the default (null) hypothesis based on the current data; it is, so to speak, new news, and requires data to back it up. You are essentially saying that you disagree with the default assumption and that something new is happening. The alternative hypothesis does not contain a statement of equality, and it is often stated relative to a pre-established success rate given in the problem statement.

In the above example comparing the efficacy of the two exercises, a p-value < 0.05 rejects the null hypothesis, and the alternative hypothesis takes its place. From here on, we get an opportunity to dig deeper into the available data and see how the sample sets differ (remember, if the null hypothesis had held, we would have concluded that the samples are all the same and that both exercises help patients equally; only once the p-value fell below 0.05 did we reject it). We can then use various statistical summaries, e.g. the mean knee-bend angle in the exercise 1 group vs the exercise 2 group, to help determine which exercise turned out to be better.
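To make the p-value mechanics concrete, here is a small self-contained permutation test in Python. This is a generic illustration with made-up knee-bend angles, not necessarily the test a clinician would run (a t-test is more common); the p-value is simply the fraction of all relabelings of the 10 patients whose group difference is at least as extreme as the observed one.

```python
# A tiny exact permutation test for the two-exercise example. The angle
# data (degrees of knee bend) are hypothetical, for illustration only.
import itertools
import statistics

exercise1 = [95, 102, 98, 110, 105]
exercise2 = [88, 91, 90, 97, 89]

observed = statistics.mean(exercise1) - statistics.mean(exercise2)
pooled = exercise1 + exercise2

extreme = total = 0
# Enumerate every way of choosing which 5 of the 10 patients form "group 1".
for idx in itertools.combinations(range(10), 5):
    g1 = [pooled[i] for i in idx]
    g2 = [pooled[i] for i in range(10) if i not in idx]
    total += 1
    if abs(statistics.mean(g1) - statistics.mean(g2)) >= abs(observed):
        extreme += 1

p_value = extreme / total
print(round(p_value, 4))  # 0.0159
if p_value < 0.05:
    print("reject H0: the two exercises differ")  # this branch runs here
else:
    print("fail to reject H0")
```

With these made-up numbers only 4 of the 252 possible relabelings are as extreme as the observed difference, so p ≈ 0.016 < 0.05 and we would reject the null hypothesis.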

Patient number 2 doing exercise number 2 after knee replacement surgery. Example for what is null and alternative hypothesis


Why do we use null-hypothesis? 

The null hypothesis is an extremely simple way to start, and that is exactly what we want in statistical inference: a starting point. The null hypothesis provides an easy one, because it is easy to describe our expectation of the data when it holds. Then, by using the p-value, we land on the alternative hypothesis, if one is available or can be inferred from the data. The null and alternative hypotheses are mutually exclusive.


Supervised Machine Learning for Beginners

Welcome. You are in the right place if you are just starting your journey learning Machine Learning. I found my very old notes / cheat sheet about Supervised Machine Learning for beginners when I started learning ML a long time ago. Here is the link to a high resolution pdf if you are interested.

A little primer to Supervised Machine Learning follows. Read my attached original notes for details

Supervised Learning is a field of Machine Learning in which an algorithm learns from examples with known outputs y, and is then able to predict y for new values of x.
Linear Regression
  • Definition: the algorithm predicts an output value y, out of infinitely many possible values, for a given input x
  • Model function: f(x) = wx + b (the equation of a straight line)
  • Graph of the function: a straight line

Classification (Logistic Regression)
  • Definition: the algorithm predicts one of a finite set of outputs / categories for a given input
  • Model function: f(x) = g(z) = 1 / (1 + e^(-z)) (the sigmoid function)
  • Graph of the function: a sigmoid curve

Convex plot of Cost function
Supervised Machine Learning handwritten cheat sheet
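The two model functions from the notes can be sketched in a few lines of Python. This is a minimal illustration with toy data and plain gradient descent, not the exact derivation from the handwritten notes.

```python
# Minimal sketch of the two model functions above: f(x) = w*x + b fitted
# by gradient descent on squared error, and the sigmoid g(z).
import math

def fit_line(xs, ys, lr=0.01, epochs=5000):
    """Fit f(x) = w*x + b by gradient descent on mean squared error."""
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        dw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * dw
        b -= lr * db
    return w, b

def sigmoid(z):
    """Logistic-regression activation: squashes any z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Toy data lying exactly on y = 2x + 1
w, b = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
print(round(w, 2), round(b, 2))  # 2.0 1.0
print(sigmoid(0))                # 0.5
```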

Now that you have notes on Supervised Machine Learning for beginners, another foundational ML topic you may be interested in is the Null and Alternative Hypothesis. Happy reading!