Device Clock Generation

Published: (June 12, 2026 at 12:24 AM EDT)
Source: Hacker News

After building a CPU, utilities for handling bus interconnects, several DMAs and memory controllers, I often find my time focused on building interfaces between designs and external peripherals. This seems to be where most of the business has landed for me. Often, these peripherals require a clock output, coming from the design, and so I’d like to spend some time describing how to generate such a “device” clock.

Fig 1. A Basic SOC with Peripherals

There are actually two topics that need to be discussed when working with modern high speed peripheral design. The first is generating the clock to be sent to the peripheral, as Fig. 1 above illustrates. The second involves processing a clock returned from the peripheral, as shown in Fig. 2 below. This is a key component of high speed designs such as DDR memories, eMMC, HyperRAM, or even NAND flash protocols. This second topic is one we shall need to come back to at a later date.

Fig 2. Data returned with a clock

Today, I’d like to discuss how to go about generating a clock to control device interaction.

I first came across this problem when building a NOR flash controller, based on first a SPI interface and later a Quad SPI interface. My controller was designed for FPGAs, and so the clock could be built with a single frequency. This design had the added complication that the clock needed to be paused from time to time. Specifically, the clock needed to be turned off when nothing was going on. Likewise, the clock needed to be turned off for one cycle after dropping (i.e. activating) the chip select pin, and for a couple cycles after the transaction was complete but before raising (deactivating) the chip select.
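The chip select and clock sequencing described above can be sketched as a small counter. This is a minimal illustration only, not the flash controller itself: the module name, the port names, and the `NBITS` and `NTAIL` parameters are all hypothetical, and it assumes one device clock pulse per system clock cycle whenever `o_sck_en` is high.

```verilog
// Sketch only: all names and parameters here are illustrative assumptions
module sckgate_sketch #(
	parameter	NBITS = 8,	// clock pulses per transaction
	parameter	NTAIL = 2	// clock-off cycles before raising CS#
) (
	input	wire	i_clk, i_reset,
	input	wire	i_start,	// request a transaction
	output	reg	o_cs_n,		// active low chip select
	output	reg	o_sck_en	// 1: emit one clock pulse this cycle
);
	localparam	CW = $clog2(NBITS+NTAIL+1);
	reg	[CW-1:0]	count;

	always @(posedge i_clk)
	if (i_reset)
	begin
		o_cs_n   <= 1'b1;
		o_sck_en <= 1'b0;
		count    <= 0;
	end else if (o_cs_n)
	begin
		// Idle: the clock is off.  On a request, drop CS# and
		// leave the clock off for the first cycle CS# is low.
		o_sck_en <= 1'b0;
		if (i_start)
		begin
			o_cs_n <= 1'b0;
			count  <= NBITS + NTAIL;
		end
	end else begin
		// Active: NBITS cycles with the clock on, followed by
		// NTAIL cycles with it off, and only then raise CS#.
		o_sck_en <= (count > NTAIL);
		if (count == 0)
			o_cs_n <= 1'b1;
		else
			count <= count - 1;
	end
endmodule
```

With the default parameters, CS# is low for eleven cycles: one with the clock off, eight with it on, and two more with it off. Feeding `{ o_sck_en, 1'b0 }` to an ODDR would then produce a pulse that is high for the first half of each enabled cycle and low otherwise.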

I had to deal with a similar problem when controlling a HyperRAM, but … that design failed when I wasn’t (yet) prepared to handle the return clock properly. I did say this deserved an article in its own right, did I not? Processing data on a return clock properly can be a challenge.

I then built a similar design for ASIC platforms. Unlike the FPGA, the final clock speed wouldn’t be known until run time. It might be that the design started at a slower clock speed, only to later speed up to the full rate at run time. Unlike an FPGA which can be fixed later, there’s really no room for failure in ASIC work. At least with an FPGA, if my board didn’t support a particular frequency, I could just rebuild the design for the clock frequency it did support. This doesn’t work, though, for an ASIC–since it tends to be cost prohibitive to rebuild the design at a later time when you decide to connect it to a slower part than the one you designed it for.

The next design I worked with was a NAND flash design. NAND flash can be a challenge, since the protocol requires you to start at a slow frequency and only after you bring up the connection are you allowed to change to a faster frequency. This particular design was built for ASIC environments, and so it depended upon an analog component generating all the clocks I needed. This worked great, up until someone wanted to purchase the design to work on an FPGA, then another wanted it to work on an FPGA, and another and so on.

Fig 3. Single Data Rate (SDR) vs Dual Data Rate (DDR)

Just to add another twist to the problem, many protocols require data transitions on both edges of the clock, a protocol often known as “Dual Data Rate” (DDR). Unlike the other designs above, these often require a clock that is 90 degrees offset from the data–so that each clock transition takes place in the middle of each data valid window, rather than on the edges of the window. This sort of “offset” clock is necessary to guarantee setup and hold times within the slave peripheral. An example of the clock and data relationship required by DDR as opposed to a traditional “single data rate” (SDR) clock is shown in Fig. 3.

By the time I got to my SDIO/eMMC controller, I think I finally had the clock division problem handled. An SDIO controller needs to bring up the SD card at 400kHz, and then depending upon the card, the PCB, and the controller, the speed may then be raised to 25MHz, 50MHz, 100MHz, or even 200MHz. The clock may also be stopped whenever either there’s nothing to send or receive, or when the SOC can’t load or unload the data to the controller. For example, you might ask an SD card to read and thus produce many blocks of data, then read the first two of these blocks into your internal buffers only to find that the CPU is slow in draining those buffers. In that case, you would need to stop the interface clock before the external card tries to send you a third block of data that would have nowhere to go.
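The two-buffer example can be sketched as a counter of full buffers that withdraws the clock enable once both are occupied. This is an illustration under assumptions, not the SDIO controller's logic: `i_blk_done` and `i_cpu_drained` are hypothetical strobes, and the sketch ignores exactly where within the protocol the clock may legally be stopped.

```verilog
// Sketch only: port names are illustrative assumptions
module ckstall_sketch (
	input	wire	i_clk, i_reset,
	input	wire	i_blk_done,	// a whole block just landed in a buffer
	input	wire	i_cpu_drained,	// the CPU just emptied one buffer
	output	wire	o_ck_en		// 0: hold the device clock stopped
);
	reg	[1:0]	full;		// number of full buffers: 0, 1, or 2
	wire		w_fill, w_drain;

	assign	w_fill  = i_blk_done    && (full != 2'd2);
	assign	w_drain = i_cpu_drained && (full != 2'd0);

	always @(posedge i_clk)
	if (i_reset)
		full <= 2'd0;
	else case({ w_fill, w_drain })
	2'b10: full <= full + 1;
	2'b01: full <= full - 1;
	default: begin end	// neither, or both at once: no net change
	endcase

	// With both buffers full, a third block would have nowhere to go
	assign	o_ck_en = (full != 2'd2);
endmodule
```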

Other devices require user programmable device clock controllers, such as:

10M/100M/1Gb Ethernet controllers

While each of these speeds might use a single clock, building a truly trimode controller requires some extra work.

(DDR) SDRAM controllers

SDRAM controllers from an FPGA standpoint tend to be simple: just produce a clock. However, you can turn the clock off for better power performance. Yes, there are rules … but we won’t get into those here today.

I2S

We discussed generating an I2S clock at a totally arbitrary frequency some time ago.

I2C

In general, I2C is too slow to be the focus of this article. There is an I3C protocol that is built on top of I2C. The techniques we discuss today might work well for I3C masters, but I’m not nearly as familiar with those.

SPI – not just NOR flash

While SPI *slaves* have a device clock as well, handling these clocks is fundamentally different from what I’m describing today. My focus today will be on generating clock signals for the purpose of controlling external devices, such as an SPI master might need to do.

Specifically, today I want to look at and discuss generating a clock with one or more of the following characteristics:

**Output Signal:** We’re talking about interface clocks, those generated by the “master” of the interface. These are digital signals, output from an FPGA (or ASIC) device.

The output may be accomplished via a component like an ODDR or an OSERDES, with or without an additional analog delay following.

**Discontinuous:** The clock may be discontinuous. Many protocols (flash, SDIO/eMMC, etc.) allow, or even require, the clock to be stopped, or otherwise only toggled when there’s something to send or receive. As mentioned above, stopping the clock may also be useful for pausing a transmission in progress before a source buffer runs dry, or an incoming buffer overflows.

**Dynamic Frequency:** Often, the outgoing clock needs to change frequency during operation as part of the protocol. For example, the SDIO protocol needs to start at 400kHz, and then increase to 25MHz (or more). Therefore, a good clock generator will need to be able to naturally generate multiple clock frequencies as the protocol requires.

**Minimum pulse width:** Switching between frequencies must be done by rule: clock glitches must be fully disallowed and guaranteed against. Too-short clock pulses cannot be allowed. Clock high and low durations must always be at least a half period of the fastest allowable clock.

**90 Degree Offset for DDR Signaling:** As shown in Fig 3, many modern protocols require both positive and negative edge signaling (DDR). This drops the required clock frequency by 2x, reducing the bandwidth that must be carried over the PCB for the same data rate. However, the clock signal required to support such DDR signaling often needs to be delayed 90 degrees from the data, so that it transitions in the middle of the data valid period.

**Faster than the controller’s clock:** Just to make matters worse, in my eMMC design, I needed to generate a 200MHz DDR device clock from a 100MHz system clock.
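One way to picture how digital logic might meet several of these requirements at once is to describe the clock as an eight bit word per system clock cycle and hand that word to an 8:1 output serializer. The following is a minimal sketch under stated assumptions rather than the design itself: the module and port names are hypothetical, the serializer is assumed to send the most significant bit first, and only three speeds are shown.

```verilog
// Sketch only: names, speed encoding, and bit order are assumptions
module ckwide_sketch (
	input	wire		i_clk, i_reset,
	input	wire		i_en,		// 0: stop the device clock (low)
	input	wire	[1:0]	i_spd,		// 0: 2x i_clk, 1: 1x, else: i_clk/2
	input	wire		i_clk90,	// 1: delay clock by a quarter period
	output	reg	[7:0]	o_ckwide	// to an 8:1 serializer, MSB first
);
	reg	[1:0]	r_spd;
	reg		r_clk90, phase;

	// Accept new settings only while stopped, so that at least one
	// all-zeros word always separates two different patterns
	always @(posedge i_clk)
	if (i_reset || (!i_en && !phase))
		{ r_spd, r_clk90 } <= { i_spd, i_clk90 };

	always @(posedge i_clk)
	if (i_reset)
		{ phase, o_ckwide } <= { 1'b0, 8'h00 };
	else if (phase)
	begin	// Second half of an i_clk/2 period: always finish it
		phase    <= 1'b0;
		o_ckwide <= (r_clk90) ? 8'hf0 : 8'hff;
	end else if (!i_en)
		o_ckwide <= 8'h00;
	else case(r_spd)
	2'd0: o_ckwide <= (r_clk90) ? 8'h66 : 8'h33;	// two periods/cycle
	2'd1: o_ckwide <= (r_clk90) ? 8'h3c : 8'h0f;	// one period/cycle
	default: begin					// half a period/cycle
		phase    <= 1'b1;
		o_ckwide <= (r_clk90) ? 8'h0f : 8'h00;
		end
	endcase
endmodule
```

With a 100MHz `i_clk`, these three settings would correspond to 200MHz, 100MHz, and 50MHz device clocks. Every word begins low and every high pulse completes before the clock stops, so no high or low pulse is ever shorter than two of the eight sub-samples, which is a half period at the fastest rate. Without the rule that settings change only while stopped, following `8'h33` directly with `8'h66` would create a one sub-sample low pulse.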

All this is to say that our goal today will be to create a divided clock using digital, rather than analog, logic. (Yes, I can hear my analog engineering friends jump in here with the comment that “Everything is analog!” God bless you, my friends.)

The Problem

The first approach I often see to this problem is the straightforward integer clock division approach. Generally, it looks something like the following:

always @(posedge src_clk)
if (reset)
	counter <= 0;
[...]

[...]

		if ([...] >= 3)
			assert(counter[NCTR-3:0] < [...]);

[...]

		cover(cvr_spd == 2 && [...] cvr_count > 2);	// 50MHz
		cover(cvr_spd == 3 &&  clk90 && cvr_count > 2);	// 25MHz
		cover(cvr_spd == 3 && !clk90 && cvr_count > 2);
		cover(cvr_spd == 4 &&  clk90 && cvr_count > 2);	// 12MHz
		cover(cvr_spd == 4 && !clk90 && cvr_count > 2);
		cover(cvr_spd == 5 &&  clk90 && cvr_count > 2);	//  8MHz
		cover(cvr_spd == 5 && !clk90 && cvr_count > 2);
		cover(cvr_spd == 6 &&  clk90 && cvr_count > 2); //  6MHz
		cover(cvr_spd == 6 && !clk90 && cvr_count > 2);
	end

We’ll have to handle covering the high speed options a bit differently. In this case, we only want to check speeds requiring OPT_SERDES if OPT_SERDES is actually set. We can’t use a run-time if for this: the cover statements would still exist when OPT_SERDES is clear, they could never be reached, and the formal tool would report the cover check as failed. Hence, we’ll use a generate statement, so that the cover statements requiring OPT_SERDES are only generated if OPT_SERDES is true. Now we can check for 200MHz, 100MHz, and 50MHz.

	generate if (OPT_SERDES)
	begin : CVR_SERDES

		always @(posedge i_clk)
		if (!i_reset)
		begin
			cover(cvr_spd == 0 &&  clk90 && cvr_count > 5);
			cover(cvr_spd == 1 &&  clk90 && cvr_count > 5);
			cover(cvr_spd == 1 && !clk90 && cvr_count > 5);
			cover(cvr_spd == 2 &&  clk90 && cvr_count > 5);
			cover(cvr_spd == 2 && !clk90 && cvr_count > 5);
		end

We can apply the same logic to OPT_DDR, but we’ll have fewer clock options to check. In this case, it’s only the 100MHz and 50MHz options.

	end else if (OPT_DDR)
	begin : CVR_DDR

		always @(posedge i_clk)
		if (!i_reset)
		begin
			cover(cvr_spd == 1 && !clk90 && cvr_count > 5);
			cover(cvr_spd == 2 &&  clk90 && cvr_count > 5);
			cover(cvr_spd == 2 && !clk90 && cvr_count > 5);
		end

	end endgenerate

By the time you get to this point, you should have a strong confidence that this device clock generator actually does what it needs to. I certainly do, and it hasn’t failed me (that I recall) since going through this exercise. Yes, other parts of this design have had problems, particularly the front end, but the clock generator has been quite reliable.

Conclusions

This is now my go-to approach whenever I need to generate a device clock:

Generate the “clock” in logic.

Generate the “clock” wide, so it can be output via either OSERDES or ODDR.

Maintain all logic transitions on the original source clock.

Use logical signals like you would enables to handle data transitions.
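As a compact illustration of the last two points, here is a minimal sketch of a divider in which everything, including the outgoing data, is registered on the source clock, and the falling "edge" of the device clock is just an enable. It is not the clock generator from this design: the module name and all port names are hypothetical, and the data path is reduced to a single bit.

```verilog
// Sketch only: names and widths are illustrative assumptions
module ckdiv_sketch (
	input	wire		i_clk, i_reset,
	input	wire		i_en,	// 0: stop low once the pulse completes
	input	wire	[7:0]	i_half,	// half period in i_clk cycles, minus one
	input	wire		i_bit,	// next data bit to send
	output	wire	[1:0]	o_ckwide,	// device "clock", to an ODDR
	output	reg		o_dat	// data, changes as the clock falls
);
	reg	[7:0]	counter;
	reg		ck;
	wire		w_fall;

	always @(posedge i_clk)
	if (i_reset)
		{ ck, counter } <= { 1'b0, 8'h00 };
	else if (counter != 0)
		counter <= counter - 1;	// mid-pulse: nothing may change
	else if (ck || i_en)
	begin
		// A high pulse is always followed by a low one.  A low
		// pulse is followed by a high one only when enabled.
		ck      <= !ck;
		counter <= i_half;	// new rates start on pulse boundaries
	end

	// True on the i_clk cycle at whose end the device clock falls
	assign	w_fall = ck && (counter == 0);

	// The data is clocked by i_clk, with w_fall acting as an enable
	always @(posedge i_clk)
	if (w_fall)
		o_dat <= i_bit;

	assign	o_ckwide = { ck, ck };
endmodule
```

Because `i_half` is only sampled when a pulse completes, every high or low pulse lasts the full `i_half + 1` cycles that were requested when it began, whether the rate changes or the clock is stopped in the meantime.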

What did this gain us? We received several advantages from this approach:

A glitchless outgoing clock

An outgoing clock that can …

change frequency upon command,

turn on and off as necessary,

stop, and yet restart on a dime, and

switch between being data aligned and offset by 90 degrees.

This is everything we would want of an outgoing clock, with none of the challenges associated with breaking the rules. Indeed, this approach works nicely in both FPGA and ASIC contexts, as I’ve now used it quite successfully in both for multiple projects. No, I don’t use the same clock generator for all my projects, but that’s for both requirements (the 200MHz clock is unique) and legal reasons.

This leaves us with the topic of the “return clock”, which we’ll need to come back to and discuss on another day.
