
Playing with Matched Filters

During my time on the red team, we continually discussed the role of matched filters in everything from GPS to fire control radars. While I’m having a blast at DARPA where I work in cyber, I wanted to revisit an old topic and put MATLAB’s Phased Array System Toolbox to the test. (Yes, radar friends, this is basic stuff. I’m mostly writing this to refresh my memory and remember how to code. Maybe a fellow manager might find this review helpful, but if you are in this field, there won’t be anything interesting or new below.)

Why use matched filters?

Few things are more fundamental to radar performance than the fact that probability of detection increases with increasing signal-to-noise ratio (SNR). For a deterministic signal in white Gaussian noise (as good an assumption as any for background noise, though the noise does not need to be Gaussian for a matched filter to work), the SNR at the receiver is maximized by using a filter matched to the signal.

One thing that always confused me about matched filters is that they aren’t really a particular type of filter, but more of a design framework: the filter is chosen to suppress the noise as much as possible relative to the signal, which maximizes the output signal-to-noise ratio. One way I’ve heard this described is that the matched filter is a time-reversed and conjugated version of the signal.

The math helps to understand what is going on here. In particular, I want to show that the peak instantaneous signal power divided by the average noise power at the output of a matched filter equals twice the input signal energy divided by the input noise power spectral density, regardless of the waveform the radar uses.

Suppose we have some signal $r(t) = s(t) + n(t)$, where $n(t)$ is the noise and $s(t)$ is the signal. The signal is finite, with duration $T$, and let’s assume the noise is white Gaussian noise with spectral height $N_0/2$. If the combined signal is passed through a filter with impulse response $h(t)$ and the resulting output is $y(t)$, you can write the signal and noise outputs ($y_s$ and $y_n$) in the time domain:

$$ y_s(t) = \int_0^t s(u) h(t-u)\,du $$
$$ y_n(t) = \int_0^t n(u) h(t-u)\,du $$

Since we want to maximize the SNR, we expand the above:

$$\text{SNR} = \frac{y_s^2(t)}{\text{E}\left[y_n^2(t) \right]}$$
$$ = \frac{ \left[ \int_0^t s(u)\, h(t-u)\,du \right]^2}{\text{E}\left[ \left( \int_0^t n(u)\, h(t-u)\,du \right)^2 \right]}$$

The denominator can be expanded:

$$\text{E} \left[y_n^2(t) \right] = \text{E}\left[ \int_0^t n(u)\, h(t-u)\,du \int_0^t n(v)\, h(t-v)\,dv \right] $$

Or

$$ \int_0^t \int_0^t E [ n(u) n(v) ] h(t-u) h(t-v) du\,dv $$

We can further simplify this by invoking a standard white noise model:

$$ E[y_n^2] = \frac{N_0}{2} \int_0^t \int_0^t \delta(u-v)\, h(t-u)\, h(t-v)\, du\,dv $$

Which simplifies nicely to:

$$ \frac{N_0}{2} \int_0^t h^2 (t-u)\, du $$

Now all together we get:

$$ SNR = \frac{ \left[ \int_0^t s(u)\, h(t-u)\,du \right]^2 }{\frac{N_0}{2} \int_0^t h^2 (t-u)\, du } $$

In order to further simplify, we employ the Cauchy-Schwarz inequality, which says that for any two elements (say $A$ and $B$) of a Hilbert space,

$$ \langle A, B \rangle^2 \leq \|A\|^2 \|B\|^2 \text{,}$$

with equality only when $A = k\,B$ for some constant $k$. Applying this, we can then look at the numerator:

$$ \left| \int_0^t s(u)\,q(u) du \right|^2 \leq \int_0^t s^2(u) du \int_0^t q^2(u) du $$

and equality is achieved when $k\,s(u) = q(u)$.

If we pick $h(t-u)$ to be equal to $k\,s(u)$, we can write our optimal SNR as:

$$ SNR^{\text{opt}} (t) = \frac{k^2 \left[ \int_0^t s^2 (u)\, du \right]^2 }{\frac{N_0 k^2}{2} \int_0^t s^2(u)\, du} = \frac{\int_0^t s^2(u)\, du}{N_0/2}$$

Since $s(t)$ always has a finite duration $T$, the SNR is maximized by setting $t=T$, which gives the well-known formula:
$$SNR^{\text{opt}} = \frac{\int_0^T s^2(u)\, du}{N_0/2} = \frac{2 \epsilon}{N_0}$$
where $\epsilon$ is the energy of the transmitted signal.

So, what can we do with matched filters?

Let’s look at an example that compares the results of matched filtering with and without spectrum weighting. (Spectrum weighting is often used with linear FM waveforms to reduce sidelobes.)

The simplest pulse compression technique I know is to shift the frequency linearly throughout the pulse. For those not familiar with pulse compression, a little review might be helpful. One fundamental issue in designing a good radar system is its ability to resolve small targets at long ranges with scant separation. This requires high energy, and the easiest way to get it is to transmit a longer pulse with enough energy to detect a small target at long range. However, a long pulse degrades range resolution. We can have our cake and eat it too if we encode a frequency change in the longer pulse. Hence, frequency or phase modulation of the signal is used to achieve a high range resolution when a long pulse is required.

The capabilities of short-pulse and high range resolution radar are significant. For example, high range resolution allows resolving more than one target with good accuracy in range without using angle information. Other applications of short-pulse and high range resolution radar include clutter reduction, glint reduction, multipath resolution, target classification, and Doppler tolerance.

The LFM pulse in particular has the advantage of greater bandwidth while keeping the pulse duration short and envelope constant. A constant envelope LFM pulse has an ambiguity function similar to that of the square pulse, except that it is skewed in the delay-Doppler plane. Slight Doppler mismatches for the LFM pulse do not change the general shape of the pulse and reduce the amplitude very little, but they do appear to shift the pulse in time.

Before going forward, I wanted to establish the math of an LFM pulse. With a center frequency of $f_0$ and chirp slope $b$, we have a simple expression for the intra-pulse phase:

$$
\phi (t) = f_0 \, t + b\,t^2
$$

Taking the derivative of the phase function gives the instantaneous frequency:

$$ \omega_i (t) = f_0 + 2\,b\,t. $$

For a chirp defined on the interval $[0, T_p]$, $\omega_i(0) = f_0$ is the minimum frequency and $\omega_i(T_p) = f_0 + 2b\,T_p$ is the maximum frequency. The sweep bandwidth is then $2\,b\,T_p$, and if $u(t)$ is the (unit) pulse envelope, a single pulse can be written as:

$$ S(t) = u(t) e^{j 2 \pi (f_0 t + b t^2)} \text{.}$$

I learn by doing, so I created a linear FM waveform with a duration of 0.1 milliseconds, a sweep bandwidth of 100 kHz, and a pulse repetition frequency of 5 kHz. Then, we will add noise to the linear FM pulse and filter the noisy signal using a matched filter. We will then observe how the matched filter works with and without spectrum weighting.
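Here is roughly what that setup looks like in MATLAB. This is a sketch that assumes the Phased Array System Toolbox is available; the 1 MHz sample rate is my own choice, while the other parameters come from the description above.

hwav = phased.LinearFMWaveform('PulseWidth',1e-4, 'PRF',5e3, ...
    'SweepBandwidth',1e5, 'SampleRate',1e6, ...
    'OutputFormat','Pulses', 'NumPulses',1);
sig = step(hwav);                          % samples of one chirped pulse
t = (0:numel(sig)-1)/hwav.SampleRate;      % time axis in seconds
plot(t*1e6, real(sig));
xlabel('Time (\mus)'); ylabel('Amplitude');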

Which produces the following chirped pulse,

[Figure: the chirped LFM pulse]

From here, we create two matched filters: one with no spectrum weighting and one with a Taylor window. We can then see the signal input and the matched filter output:

[Figure: matched filter input and output]
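Constructing the two filters is only a couple of lines. A sketch, assuming getMatchedFilter and phased.MatchedFilter from the same toolbox (the variable names are mine):

coeff = getMatchedFilter(hwav);            % matched filter taps for the waveform
hmf        = phased.MatchedFilter('Coefficients', coeff);
hmf_taylor = phased.MatchedFilter('Coefficients', coeff, ...
                                  'SpectrumWindow', 'Taylor');
y_plain  = step(hmf, sig);                 % output with no spectrum weighting
y_taylor = step(hmf_taylor, sig);          % output with Taylor spectrum weighting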

To really see how this works we need to add some noise:

% Create the signal and add noise.
sig = step(hwav);
rng(17)
x = sig+0.5*(randn(length(sig),1)+1j*randn(length(sig),1));

And we can see the impact noise has on the original signal:

[Figure: the input signal plus noise]

and the final output (both with and without a Taylor window):

[Figure: matched filter output, with and without a Taylor window]

The Ambiguity Function

While it is cool to see the matched filter working, my background is more in stochastic modeling and my interest is in the radar ambiguity function, which is a much more comprehensive way to examine the performance of a matched filter. The ambiguity function is a two-dimensional function of time delay and Doppler frequency, $\chi(\tau,\nu)$, showing the distortion of the matched filter output caused by the Doppler shift of the return from a moving target. It is the time response of a filter matched to a given finite-energy signal when the signal is received with a delay $\tau$ and a Doppler shift $\nu$ relative to the nominal values expected by the filter, or:

$$
|\chi ( \tau, \nu)| = \left| \int_{-\infty}^{\infty} u(t)\, u^* (t + \tau)\, e^{j 2 \pi \nu t}\, dt \right| \text{.}
$$
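MATLAB’s ambgfun (also in the Phased Array System Toolbox) computes this surface directly. A rough sketch, reusing the hwav and sig defined earlier:

[afmag, delay, doppler] = ambgfun(sig, hwav.SampleRate, hwav.PRF);
surf(delay*1e6, doppler/1e3, afmag, 'LineStyle', 'none');
xlabel('Delay \tau (\mus)'); ylabel('Doppler \nu (kHz)');
zlabel('|\chi(\tau,\nu)|');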

What is the ambiguity function of an uncompressed pulse?

For an uncompressed, rectangular pulse the ambiguity function is relatively simple and symmetric.

[Figure: ambiguity function of a rectangular pulse]

What does the ambiguity function look like for the LFM pulse described above?

If we compare two pulses, each with a duty cycle of one (PRF of 20 kHz and pulse width of 50 µs), we can see their differing ambiguity functions:

[Figure: ambiguity functions of the two pulses]

If we look at the ambiguity function of an LFM pulse with the following properties:

        SampleRate: 200000
        PulseWidth: 5e-05
               PRF: 10000
    SweepBandwidth: 100000
    SweepDirection: 'Up'
     SweepInterval: 'Positive'
          Envelope: 'Rectangular'
      OutputFormat: 'Pulses'
         NumPulses: 5

then we can see how complex the surface is:

[Figure: 3-D ambiguity function of the LFM pulse]

References

  • http://www.ece.gatech.edu/research/labs/sarl/tutorials/ECE4606/14-MatchedFilter.pdf
  • MATLAB help files

One Time Pad Cryptography

This was much harder than it should have been. While this is certainly the most trivial post on crypto-math on the web, I wanted to share my MATLAB xor code in the hope that I save someone else some time. It is a basic property of cryptography that a one-time pad must be used only once. An example like this makes it very concrete:

Suppose you are told that the one-time pad encryption of the message “attack at dawn” is 09e1c5f70a65ac519458e7e53f36 (the plaintext letters are encoded as 8-bit ASCII and the given ciphertext is written in hex). What would be the one-time pad encryption of the message “attack at dusk” under the same OTP key?

Let $m_0$ be the message “attack at dawn” and $m_1$ the message “attack at dusk”, with $c_0$ and $c_1$ the corresponding ciphertexts. If $p$ is the one-time pad (OTP) that encrypts the message, we then have:

$$ c_0 = m_0 \oplus p$$

So we can obtain the one-time pad by performing an XOR of the ciphertext with the plaintext:

$$ p = c_0 \oplus m_0 \text{.}$$

This enables us to encrypt the new message without using the OTP explicitly:

$$c_1 = m_1 \oplus p = m_1 \oplus \left(c_0 \oplus m_0 \right) = c_0 \oplus (m_0 \oplus m_1) \text{.}$$

You could truncate down to only the characters that are different, but since I’m writing a script for this, I didn’t bother.

In Python 2 this would be super short:

def strxor(s1, s2):
    return ''.join(chr(ord(a) ^ ord(b)) for a, b in zip(s1, s2))

strxor(strxor("6c73d5240a948c86981bc294814d".decode('hex'), "attack at dawn"), "attack at dusk").encode('hex')

>>>'6c73d5240a948c86981bc2808548'

But the government won’t allow me to have Python on my laptop, so I need to use MATLAB.
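For what it’s worth, the MATLAB version ends up almost as short. A sketch (the strxor helper and the variable names are my own):

strxor = @(s1, s2) char(bitxor(double(s1), double(s2)));   % XOR two ASCII strings

c0hex = '6c73d5240a948c86981bc294814d';                    % given ciphertext (hex)
c0 = char(hex2dec(reshape(c0hex, 2, []).').');             % hex string -> raw bytes
c1 = strxor(strxor(c0, 'attack at dawn'), 'attack at dusk');
c1hex = lower(reshape(dec2hex(double(c1), 2).', 1, []))    % -> '6c73d5240a948c86981bc2808548'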

Some Helpful Links

  • My course slides: http://spark-university.s3.amazonaws.com/stanford-crypto/slides/02-stream-v2-annotated.pdf
  • Another Solution: http://crypto.stackexchange.com/questions/10534/how-to-decode-an-otp-message
  • Some good general info: http://www.binaryvision.nl/3075/statistical-test/

Cryptography

We are swimming in a sea of information. Without encryption this whole operation would be a very public event. Encryption enables our communications to be anonymous or secure and makes e-commerce and private communications possible. Because of this, I registered for Dan Boneh’s cryptography course. At this point, I’ve only watched one lecture, but I have some initial thoughts (and some code) that I wanted to jot down.

At its simplest, encryption is used to convert data into a form that a third party can’t read. There are multiple reasons for wanting to do this, but the most common is a desire for security or privacy. In a more comprehensive sense, the aim of cryptography is to construct and analyze protocols that overcome the influence of adversaries and that address various aspects of information security such as data confidentiality, data integrity, authentication, and non-repudiation. Modern cryptography is heavily based on mathematical theory and on assumptions about the capability of current computer hardware. No code is infinitely secure, and nearly all are theoretically possible to break, but a “secure” code is one considered infeasible to break by any known practical means. The study of encryption is of high interest to me because it intersects many of my current interests: the disciplines of mathematics, computer science, and electrical engineering.

Dan Boneh’s initial lecture covered the traditional overview, definition, and history of cryptography. His overview was rich with basic applications of cryptography and its role in our lives today.

The concept of encryption and decryption requires some extra information to encode and decode the signal. Though this information takes many forms, it is commonly known as the key. In some cases the same key can be used for both encryption and decryption (a shared key), while in other cases encryption and decryption require different keys (such as a public/private key arrangement), and this is one way to organize existing techniques.

He provides a strong admonishment to use “standard cryptographic primitives” that have withstood public scrutiny, and makes the point that without the necessary peer review by a very large community over many, many years, one can’t trust a cryptographic implementation. For this reason he admonishes the student never to trust an implementation based on proprietary primitives. (The student is left to wonder what exactly a cryptographic primitive is.)

He highlights that cryptography has its limitations, and even a secure cryptographic channel does not guarantee a secure implementation. It was helpful that he followed this statement up with what exactly an insecure implementation is by surveying how to break different ciphers. He mentions a known-insecure Blu-ray protection standard called AACS and promises a forthcoming discussion of the mechanics of its compromise.

From here, he discusses applications such as private elections and auctions and also the mechanism of digital signatures. He ends the lecture by discussing some of the “magic” recent developments in encryption, such as homomorphic encryption, where operations can be performed on encrypted data without decryption. (See the DARPA PROCEED program.) This has fascinating applications, such as the ability to query a database without giving an observer of the database (before, during, or after) any insight into the nature of the query.

He closes by stating that any cryptographic implementation has three requirements: a precise specification of a threat model, a proposed construction, and a proof that breaking the construction under the threat model would solve an underlying hard problem.

The next lecture was my favorite. Here, Boneh surveyed the history of cryptography, which included a lot of the codes you play with in school, such as symmetric and substitution ciphers, along with a discussion of how to break them using frequency and pattern recognition techniques (mostly based on known distributions of letters in the underlying language).

He then introduces interesting ciphers such as the Caesar cipher, in which each letter of the alphabet is shifted along some number of places; for example, in a Caesar cipher of shift 3, A would become D, B would become E, Y would become B, and so on. He then moves on to more complex ciphers such as the Vigenère cipher (a simple form of polyalphabetic substitution developed in the 16th century) and an interesting discussion of rotor machines (the most famous of which was the German Enigma). The Vigenère cipher consists of several Caesar ciphers in sequence with different shift values. He ended the lecture with a quick description of modern encryption techniques.

I always enjoy the introductory lectures of a course: I can generally follow everything and am excited about the material. A lot of my friends in college would not go to the first lecture, but I never missed it. It was nice to have at least one lecture where I could follow along. This sounds like it will be an interesting ride.

Enough of the discussion; let’s see how this works. As discussed above, the Vigenère cipher produces encrypted ciphertext from an input plaintext message using a key and a matrix of substitution alphabets. With the MATLAB below, you can generate the Vigenère square, also known as the tabula recta, which can be used for both encryption and decryption. It consists of the alphabet written out 26 times in different rows, each alphabet shifted cyclically to the left compared to the previous one, corresponding to the 26 possible Caesar ciphers. At different points in the encryption process, the cipher uses a different alphabet from one of the rows; the alphabet used at each point depends on a repeating keyword.

Let’s try this with MATLAB (see code below). If I use a key of ‘Ceasar knows something you do not’ and a secret message of Titus Lucretius Carus (a Roman epicurean epic poet), we get:

encrypt('Titus Lucretius Carus', 'Ceasar knows something you do not')
VMTLSQKDPE KHLFLGTYBE
decrypt('VMTLSQKDPE KHLFLGTYBE', 'Ceasar knows something you do not')
TITUS LUCRETIUS CARUS

It works!
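The encrypt and decrypt functions themselves aren’t reproduced above, so here is a minimal sketch of how such functions might look. It is my own version: it assumes A-Z only, strips non-letters from the key, and passes non-letters in the message straight through, so its handling of spaces may differ from the exact output shown above. Save each function in its own .m file (or keep them as local functions).

function c = encrypt(msg, key)
% Vigenere encryption sketch: shift each letter forward by the key letter
c = vigenere(msg, key, +1);
end

function m = decrypt(msg, key)
% Vigenere decryption sketch: shift each letter back by the key letter
m = vigenere(msg, key, -1);
end

function out = vigenere(msg, key, direction)
msg = upper(msg);
key = upper(key(isletter(key)));          % keep only the letters of the key
out = msg;
k = 1;                                    % index into the repeating key
for i = 1:length(msg)
    if isletter(msg(i))
        shift = key(k) - 'A';
        out(i) = char(mod(msg(i) - 'A' + direction*shift, 26) + 'A');
        k = mod(k, length(key)) + 1;      % advance the key only on letters
    end
end
end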


Review: Boys Should Be Boys: 7 Secrets to Raising Healthy Sons

From a one-star review on Amazon:

The content was obvious and the tone was judgmental. The complete lack of nuance is painful. Apparently receiving an MD over 25 years ago makes this Dr. Laura-style author an expert in child psychology? Let’s leave the psychology topics to those professionally trained in that discipline.

I’ve always enjoyed Meg Meeker’s books and her latest is no exception. She’s very practical and conversational, and she brings together a refreshing mix of social conservatism and practical medical know-how. Every chapter is focused on concrete advice for parents to become more effective in crafting the virtue (and therefore well-being) of our sons. As the quote above shows, she presents a perspective that is out of sync with aspects of modern culture, and in particular with modern medicine’s trend of hyper-specialization and the unwritten rule to leave moral judgments out of medical advice. We all know this trend has risks: a specialist is going to miss the whole-person concept that is critical to understand as we tackle a problem as complex as parenting. Forcing those who give us medical and parenting advice to be materialists forces life’s great questions out of the discussion. A materialist view misses the most important dynamics in developing character and sons who become men of virtue.

Meg Meeker explains that boys need the freedom to explore and test their limits, even if this means some scrapes, bruises and difficult moments. She tries to strike the balance between helicopter and laissez-faire parenting. She slices through these two extremes with a simple call to engage: to double the time we spend with our boys all the while loving them enough to force them to grow in difficult and engaging situations.

It is out of the tension between caring too little and caring too much that she weaves her plan for an ideal father. In many ways, I find her book more interesting for what it says not to do than for what it says to do. She reminds us of the danger of letting our boys be cast adrift into a toxic mix of video games, ersatz online relationships, and a hyper-sexualized culture that elevates an individual’s emotions over an external, fixed framework of morality.

She makes it clear that there’s no substitute for personal time and attention. She paints the ideal parent as always engaged and aware of what their children are doing, in a manner that doesn’t dictate the details of their life but does pour compassion and love into their schedule while allowing them to grow and develop in natural situations. In reading her book, she makes it clear that to avoid the harmful influences of society, we as fathers have to be committed and focused on protecting our sons and fostering the right environment, one that allows them to develop in healthy ways.

In 12 chapters she starts with a review of the problem and then goes over seven areas of focus. Here, in brief, they are:

  • Know how to encourage your son. One fault is babying and spoiling him. But another is being so harsh that you lose communication with your son and destroy his sense of self worth. We’ll look at how to strike the right balance.
  • Understand what your boys need. Guess what? It’s not another computer game; it’s you. We’ll look at how to get the most of your time with your son.
  • Recognize that boys were made for the outdoors. Boys love being outside. A healthy boy needs that sense of adventure— and the reality check that the outdoors gives him.
  • Remember that boys need rules. Boys instinctively have a boy code. If you don’t set rules, however, they feel lost.
  • Acknowledge that virtue is not just for girls. Boys should, indeed, be boys—but boys who drink, take drugs, and have sex outside of marriage aren’t “normal” teenagers; they have been abnormally socialized by our unfortunately toxic culture. Today, my practice as a pediatrician has to deal with an epidemic of serious, even life-threatening, problems—physical and psychological—that were of comparatively minor concern only forty years ago. A healthy boy strives after virtues like integrity and self-control. In fact, it is virtues like these that make a boy’s transition to manhood possible.
  • Learn how to teach your son about the big questions in life. Many parents shy away from this, either because they are uncomfortable with these questions themselves, or want to dismiss them as unimportant or even pernicious, or because they don’t want to “impose” their views on their children. But whatever one’s personal view, your son wants to know— and needs to know—why he’s here, what his purpose in life is, why he is important. Boys who don’t have a well-grounded understanding on these big questions are the most vulnerable to being led astray into self-destructive behaviors.
  • Remember, always, that the most important person in your son’s life is you.

In the second chapter, she addresses how to deal with peer pressure, with a particular emphasis on how toxic our culture is for boys and their identities. She goes on to discuss boys’ natural tendencies and how helpful rough and dangerous activities can be; this is exactly the natural state of boys’ development. She points out that neighborhood games played by boys of different ages force them to learn important life lessons that they can’t learn anywhere else.

In the fourth chapter, she explores the role of electronics and virtual worlds and the influence they have on the development of young boys. I’d recommend that all parents of young children read Parmy Olson’s book “We Are Anonymous” to better understand how amazingly toxic (and captivating) the underbelly of the internet is for children. (Something we all know, but she brings it forward in vivid detail.)

In an interesting turn, she then explores the societal animosity towards teenage boys. I just finished Sheryl Sandberg’s Lean In, in which she makes an excellent case for societal biases against women leaders in business. By contrast, Meeker makes an excellent point that society presents a self-reinforcing feedback loop that casts teenage boys as moody, depressed, and angry. She makes some excellent points that have particular poignancy coming from a medical professional who is used to dealing with teenage boys: it is okay to be depressed and moody, and this is not cause for alarm or overreaction from parents. The solution is more old-fashioned than our modern and hyper-specialized world wants: more time and attention, in the context of a strategic perspective.

The focus of that increased time and attention is the subject of the next chapter. Chapter 6 talks about practical ways to build self-confidence and mental health in our boys. She talks about the critical importance of the father’s blessing, something that always proved far too elusive for me. She describes the feeling of true accomplishment as a powerful emotional resource builder. (I think it’s helpful to contrast true accomplishment with the common empty, fawning praise and declarations of how special my children are that they find in school.)

Particularly convicting is her clarion call to live my life in an exemplary way that sets the right standards for my son. I want to model the virtues that he should have, and she challenges us to picture our son at the age of 25 and to foster the virtues we desire in him, much in the same way Dan Allender’s Bold Love tells us to carefully and tirelessly pursue love with the cunning of a fox.

She moves on in the next chapter to discuss why so many men are merely aged adolescents: They never got through the transition from being a boy to becoming a man. She diagnoses this, in her clinical way, as the result of the absence of a father’s guidance.

She matches this with the next chapter, which talks about the importance of faith and of the knowledge of an external God to whom boys feel accountable. She describes how faith in God helps children have a well of hope to draw from as life gets tough, develop an understanding of love that is more than a pleasurable act between bodies, understand the importance of truth and accountability, and grasp the critical importance of repentance, forgiveness, and grace. Here, as in the rest of the book, she makes it clear that this doesn’t mean simply dropping off your son at church and hoping he finds God; she calls fathers to again be the best that they can be, for themselves but also for their sons, and to model the ideal behavior.

My biggest criticism of her book is the way she remains generic towards faith. While a Judeo-Christian concept of God has been foundational to a historical US worldview, I think she should be more honest in explaining the particular faith she holds and its critical importance to our sons’ eternal destiny. A general “faith” without conviction is not what we want our sons to have. Is she really advocating teaching our sons about Islam? She remains neutral on how God is defined, without question, to reach a broader audience. But her own faith of Christianity claims exclusivity, and I found it disappointing that she avoided this.

Her book culminates in the 11th chapter, where she calls us to ideate the core virtues we want our sons to have that will ensure they make the transition from boy to man. She emphasizes virtues we all want in our children such as integrity, courage, humility, meekness, and kindness. She doesn’t just introduce these as words but fully fleshes them out into concepts and practical steps to build them in our sons.

She ends the book with 10 tips to remember and a call to double whatever time we currently are investing in our sons. Here are the 10 tips:

1) Know that you change his world
2) Raise him from the inside out (worry about his inner life and the outer life will follow)
3) Help his masculinity to explode
4) Help him find purpose and passion (other than being a video game master)
5) Teach him to serve (this is where church can come in handy)
6) Insist on self-respect
7) Persevere
8) Be his hero
9) Watch, then watch again (pay close attention to what is going on in his life)
10) Give him the best of yourself (not just the leftovers)


GE GXULQR Twist and Lock Kitchen or Bath Filtration System Replacement Filter Won’t Fit

If you ever have to replace your under the sink GE filter (specifically the GE GXULQR), be ready for a frustrating ride.

It looks like this:

[Figure: the GE GXULQR filter]

You can find the manual here and the GE product page here.

Although it is billed as the “Twist and Lock” replacement filter, the inside female hex socket sometimes rotates out of alignment when you remove the old filter. If this happens, there is no easy fix. Moreover, since the receiving head is generally under a sink or in some place you can’t easily access, it is very frustrating when you are pushing into the plastic receiver and the new filter doesn’t fit. (It is extra frustrating if you have a cast on a broken right hand, carpal tunnel in your left hand, and can’t bend your back due to a herniated L5/S1.)

[Figure: the filter head/bracket]

The instructions only say to “Push filter into the filter head/bracket. Turn filter 1/4 turn to the right until it stops. The top surface of the filter will be flush with the bottom of the filter head/bracket when fully installed.” This only works if everything is properly lined up. The solution is to re-align the receiver with a new tool made from the old filter. By cutting off the head of the old filter (I used an angle grinder) and hammering in a flathead screwdriver, you get a custom alignment tool that looks like this:

[Figure: the custom alignment tool]

How far should you rotate it? Enough to get the socket to line up so the flanges align. For me, this meant the top of the hex was flat. The biggest challenge is knowing how hard to turn: I still don’t know the internal mechanism, the casing is all plastic, and I didn’t want to break it. I had to rotate it pretty hard before it turned. This was a little scary because water started to leak out from inside. However, after my wife rotated in the cartridge (two hands were necessary, and a cast doesn’t help), everything seems to be working fine.

Hopefully, this spares you some frustration.


Gradient Descent


It has been a while since I’ve studied optimization, but gradient descent is always good to brush up on. Most optimization involves derivatives. Often known as the method of steepest descent, gradient descent works by taking steps proportional to the negative of the gradient of the function at the current point.

Mathematically, for two parameters, say $\theta_0$ and $\theta_1$, gradient descent is defined by repeated updates of:

$$ \theta_j := \theta_j - \alpha \frac{\partial J (\theta_0 , \theta_1) }{\partial \theta_j } $$

from the hypothesis:

$$ \begin{aligned}
h_\theta(x) = \sum_{i=1}^n \theta_i x_i = \theta^{T}x.
\end{aligned} $$

The $\alpha$ is the learning rate. If $\alpha$ is very large, we will take huge steps downhill; a small $\alpha$ means baby steps downhill. If $\alpha$ is too small, it might take too long to get our answer; if $\alpha$ is too large, we might step past our solution and never converge. Besides the pitfall of picking an $\alpha$, you need a cost function whose derivatives exist (it must be differentiable) and, ideally, convex.

Often for linear regression we use batch gradient descent. In machine learning terminology, batch gradient descent uses all training examples: every step of gradient descent uses the residuals from the entire training set in the least squares calculation.

Why is this relevant to machine learning? For linear regression there is an analytical solution (the normal equation), but it involves a matrix inversion that can be very large and computationally infeasible:

$$ \theta = (X^T X)^{-1} X^T y $$

Gradient descent gets used because it is a numerical method that generally works, is easy to implement, and is a very generic optimization technique. Also, analytical solutions are strongly tied to the model, so implementing them can be inefficient if you plan to generalize or change your models in the future. They are sometimes less efficient than their numerical approximations, and sometimes they are simply harder to implement.
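As a concrete illustration, here is a minimal batch gradient descent for linear regression on made-up data, with the normal equation alongside for comparison. The data, learning rate, and iteration count are arbitrary choices for the sketch.

% Minimal sketch: batch gradient descent for linear regression (fake data)
rng(1);
m = 100;                              % number of training examples
x = linspace(0, 10, m).';
X = [ones(m,1) x];                    % design matrix with intercept column
y = 3 + 2*x + randn(m,1);             % noisy line, true theta = [3; 2]

alpha = 0.01;                         % learning rate
theta = zeros(2,1);
for iter = 1:5000
    grad  = (X.'*(X*theta - y))/m;    % gradient of (1/2m)*||X*theta - y||^2
    theta = theta - alpha*grad;       % gradient descent update
end

theta_analytic = (X.'*X)\(X.'*y);     % normal equation, for comparison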

To sum up, gradient descent is preferable over an analytical solution if:

  • you are considering changes in the model, generalizations, adding some more complex terms/regularization/modifications
  • you need a generic method because you do not know much about the future of the code and the model
  • analytical solution is more expensive computationally, and you need efficiency
  • analytical solution requires more memory or processing power/time than you have or want to use
  • analytical solution is hard to implement and you need easy, simple code

FlexEvents: WOD 3 Simulation

During a recent CrossFit competition, Flex on the Mall, we had to accomplish a team chipper in which we were able to pick the order of team members in order to minimize our workout time. The workout was simple: 50 burpees, 40 over-the-box jumps, and 30 kettlebell snatches. Each exercise had to be accomplished in serial: no one could start an exercise until the previous team member had finished it. This meant everyone waited while the first person started. As simple as this was, I got confused when figuring out the optimal order for team members: should the slowest go first or last?

At the time, I thought the best strategy would be to have the fastest go first, to prevent the scenario where the fast folks were waiting and unable to help the team. I was focused on the idea that you didn’t want anyone to be waiting — so clear out the fast people first. My partners wanted the slowest to go first because the slower participants could rest, go super slow and not affect the final score.

While I was wrong and they were correct, this didn’t make sense at the time, because I was viewing the whole thing as a linear operation where order didn’t matter but waiting on someone would definitely slow down the overall time. It turns out that if you put the slowest last, no one is waiting, but the clock is running entirely on the slowest person’s time, when you otherwise could have hidden their slow time. The workout came down to the following critical path: the time it took to do everyone’s burpees plus the last participant’s time on the remaining events.

However, this is only true if the time it took to do burpees was significantly more than the other events and there was not a significant difference in fitness between team members. After the competition, I wanted to understand the dynamics of this workout and build a quick model to understand where these assumptions held true and hone my intuition for stuff like this.

It turns out the worst thing to do is have the slowest person go last. The reason is really simple: you are putting them in the critical path. In fact, an optimal strategy is to have the slowest go first. Assume all four members have different speeds, based on expected values of their workout times. Let’s say the four members have the following expected completion times (in notional units), where person 1 is the fastest and each successive participant is slower in all events.

Note: These are totally made up numbers and have nothing to do with our team . . . says the man whose wife now routinely kicks his scores to the curb.

person      1    2    3    4
burpees     9   10   15   17
box jumps   7    8   13   15
KB snatch   5    7   11   13


In this case, I wrote some MATLAB to look through all (4! = 24) permutations. (Remember: 1 is fastest, 4 is slowest.)
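A sketch of that permutation search is below. It assumes the rule is that a person cannot start an exercise until the previous team member has finished that same exercise (and cannot start their own next exercise until they finish the current one); the times are the notional values from the table above.

% Sketch of the permutation search over team orderings
t = [ 9 10 15 17;      % burpee times for persons 1-4
      7  8 13 15;      % box jump times
      5  7 11 13 ];    % KB snatch times

orders = perms(1:4);                 % all 24 possible orderings
total  = zeros(size(orders,1),1);
for p = 1:size(orders,1)
    o = orders(p,:);
    C = zeros(3,4);                  % C(j,k): finish time of exercise j for k-th athlete
    for k = 1:4
        for j = 1:3
            prevAthlete  = 0; if k > 1, prevAthlete  = C(j,k-1); end
            prevExercise = 0; if j > 1, prevExercise = C(j-1,k); end
            C(j,k) = max(prevAthlete, prevExercise) + t(j,o(k));
        end
    end
    total(p) = C(3,4);               % team time: last athlete finishes last exercise
end
[best, idx] = min(total);
bestOrder = orders(idx,:);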

[Figure: total workout time for all 24 orderings]

This is such a simple event that you can see the basic building blocks without much introspection: 21 is the fastest individual time through the sequence and 45 is the slowest. If the team were made up entirely of person 1, they could complete the whole thing in 48. As this fictional team stands, burpees alone will account for 51 regardless of order, but if the fastest goes at the end, you can get away with only adding 15 to the total time, versus adding 28 if the slowest person goes last.

Several things here. It would be simple enough to work the math with variables instead of numbers, and it probably should be done. A better analysis could show how much variation you could tolerate between athletes before the dynamics above no longer apply. Maybe another lunch break for that.

OK, on further thought, I decided to look at several scenarios and show the optimal strategies.

A team member is much slower on burpees, but faster on the other events.

In this case, it still makes sense for her to go last. The burpees are linear, but speed in the bottom two events determines the order. It helps me to assign a color to each team member and see all 24 strategies on one plot. The lightest shade is member one, who has a burpee time of 40 compared to 4 for the others but is faster in the other two events. In the plot below, the fastest combinations are at the top, where you can see that all the fastest outcomes have member one going last.

[Figure: outcomes when member one is much slower on burpees]

Burpee times equal KB snatch times

Suppose their event times look like this, where everyone’s KB snatch times equal their burpee times:

burpee times:    1   2   3   4
box jump times:  2   2   2   2
KB snatch times: 1   2   3   4

Then you get a truly random pattern where all outcomes are equal regardless of order. So in this case it doesn’t matter what order you do the workout in, even though 4 is much slower than 1. Interesting and counter-intuitive.

[Figure: all orderings produce equal total times]

My code is below for any of those interested.


CY2014 Quarter 1 Financial Review

Chrissy and I review our spending on a quarterly basis. Updating every 90 days isn’t too long to correct mistakes and remember purchases, but it also allows for the busy multi-week sprints that life presents us. While we have used every financial management program available, I’ve found the most straightforward and flexible solution is to download historical transactions into Excel where I can assign categories and do the type of analysis you can see below. This works for me because I have complete control. All the other solutions I used (MS Money, Quicken, Mint, GNU Wallet) introduce errors that have required lots of time to fix (or that can’t be fixed), but more importantly they constrain me to their interface and I got used to exporting information into tools that could flexibly answer my questions.

My basic workflow is to download statements from all our bank accounts and credit cards and put them all into one spreadsheet, where I ensure a consistent list of categories. I can do this quickly by filtering and sorting, as most of our expenses are cyclical. Once everything is in the right format, I use lots of Excel SUMIF and SUMIFS functions to produce reports.

My purpose in doing a financial review is to accomplish the following:

  • Quality check (Are we getting paid the right amounts? Any incorrect expenses?)
  • Spending feedback (Are we overpaying in any categories? Anything we need to rein in?)
  • Tax Production

While the tax production and quality check were very helpful to me, I wanted to share the results of the spend analysis in case my reports might be useful to others.

Spending feedback

In summary, we had a small rise in our overall Grocery and Dining out categories, but the major cost drivers were:

  • Ellie’s 12 cavities were very expensive (no dental insurance)
  • We bought a new espresso machine (major purchase for us)
  • We bought a new car
  • We went crazy on clothes
  • We committed (again) to Army Navy Country Club

[Figure: spending by category]

Where are we spending?

This doesn’t have a real effect on our spending, but I thought it was interesting. We don’t have savings/investments in here; this is just “spending”. I treated stuff like insurance, taxes, medical, fees, haircuts, etc. as “cost of life”: things I feel we can’t avoid and don’t really have discretion in spending. Some other stuff that might fit this category (the power bill) gets lumped into household (as do home maintenance and the mortgage). I would love to do some more analysis and compare our spending to this article.

[Figure: spending breakdown pie chart]

Daily Feedback

The plot below has categories on the y-axis and days on the x-axis. Intensity of color is the spend amount. I used MATLAB to produce this plot. I like it because the colormap filters everything in a way that comes out like a log scale, and that tells me what is a big deal and what is noise. The interesting dynamic is the frequency/magnitude trade-off in spending: medical comes in seldom, big chunks, while groceries are a constant but smaller expense.
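For anyone curious, a plot like this can be produced with imagesc; the sketch below uses made-up categories and random data rather than our actual transactions.

% Sketch: daily spending heatmap by category (fake data for illustration)
categories = {'Grocery','Dining','Medical','Household','Clothes'};
days  = 1:90;
spend = abs(randn(numel(categories), numel(days))).^3 * 20;   % fake spending
imagesc(days, 1:numel(categories), spend);
set(gca, 'YTick', 1:numel(categories), 'YTickLabel', categories);
xlabel('Day of quarter'); ylabel('Category');
colorbar;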

[Figure: daily spending heatmap by category]

You can see that our daily spending has a huge variance: the standard deviation was twice our average daily spending, so big purchases had a pronounced effect. The bar chart below breaks spending into levels of discretion, from totally discretionary (dining out) through some and limited discretion (haircuts, medical) down to non-discretionary (mortgage, tax) at the bottom.

[Figure: daily spending bar chart by discretion level]

Weekly Feedback

Click on the image below to see it full size.

[Figure: weekly spending]

So how much can we control this?

If I break down spending into four categories:

  • Committed — We have to pay it (i.e. Mortgage)
  • Limited Discretion — We can commit extra time to reduce it (i.e. Home and Car Maintenance)
  • Some Discretion — We can make choices to decrease our quality of purchase (i.e. Groceries)
  • Total Discretion — We can do without this if we have to (i.e. Dining Out/New Clothes)

It turns out that a third of our expenses are committed, while about a quarter each fall into limited and some discretion. Roughly 20% of our expenses are totally discretionary, and 70% of our expenses could be changed if we had to. The takeaway for me is to focus on eliminating the stuff we pay for but don’t enjoy (fees) and the things that don’t bring joy/reward for their cost.


Fun with Bessel Functions


Well, I certainly forget things faster than I learn them. Today is a quick review of Bessel functions and their applications to signal processing.

The Bessel functions appear in lots of situations (think wave propagation and static potentials), particularly those that involve cylindrical symmetry. While special types of what would later be known as Bessel functions were studied by Euler, Lagrange, and the Bernoullis, the Bessel functions were first used systematically by F. W. Bessel to describe three-body motion, with the Bessel functions appearing in the series expansion of planetary perturbations.

First, I think they should be called Bernoulli-Bessel functions, both because that sounds more pompous and because they were discovered by Daniel Bernoulli and generalized by Friedrich Bessel. While they sound (and can be) complicated, they are the canonical solutions $y(x)$ of Bessel’s differential equation:

$$ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \alpha^2)y = 0 $$

for an arbitrary complex number $\alpha$ (where $\alpha$ denotes the order of the Bessel function). The most important cases are $\alpha$ an integer or half-integer. Since all math ties together, I find it pretty cool that Bessel functions appear in the solution to Laplace’s equation in cylindrical coordinates.

Although $\alpha$ and $-\alpha$ produce the same differential equation, it is conventional to define different Bessel functions for these two values in such a way that the Bessel functions are mostly smooth functions of $\alpha$.

Bessel functions of the first kind: $J_\alpha$

Bessel functions of the first kind, known as $J_\alpha(x)$, are solutions of Bessel’s differential equation that are finite at the origin ($x = 0$) for integer or positive $\alpha$, and diverge as $x$ approaches zero for negative non-integer $\alpha$. It is possible to define the function by its Taylor series expansion around $x = 0$:

$$ J_\alpha(x) = \sum_{m=0}^\infty \frac{(-1)^m}{m! \, \Gamma(m+\alpha+1)} {\left(\frac{x}{2}\right)}^{2m+\alpha} $$

where $\Gamma(z)$ is the gamma function, a shifted generalization of the factorial function to non-integer values.
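MATLAB ships with besselj, so it is easy to get a feel for the first few orders. A quick sketch:

% Plot the first few Bessel functions of the first kind
x = 0:0.05:20;
hold on;
for alpha = 0:3
    plot(x, besselj(alpha, x));
end
hold off;
legend('J_0','J_1','J_2','J_3');
xlabel('x'); ylabel('J_\alpha(x)');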

Bessel functions of the second kind: $Y_\alpha$

The Bessel functions of the second kind, denoted by $Y_\alpha(x)$, are solutions of the Bessel differential equation that have a singularity at the origin.

For non-integer $\alpha$, $Y_\alpha$ is related to $J_\alpha$ by:

$$ Y_\alpha(x) = \frac{J_\alpha(x) \cos(\alpha\pi) - J_{-\alpha}(x)}{\sin(\alpha\pi)} $$

In the case of integer order $n$, the function is defined by taking the limit as a non-integer $\alpha$ tends to $n$:

$$ Y_n(x) = \lim_{\alpha \to n} Y_\alpha(x). $$

There is also a corresponding integral formula (for $\operatorname{Re}(x) > 0$):

$$ Y_n(x) =\frac{1}{\pi} \int_0^\pi \sin(x \sin\theta - n\theta) \, d\theta - \frac{1}{\pi} \int_0^\infty \left[ e^{n t} + (-1)^n e^{-n t} \right] e^{-x \sinh t} \, dt.$$

$Y_\alpha(x)$ is necessary as the second linearly independent solution of Bessel’s equation when $\alpha$ is an integer. But $Y_\alpha(x)$ can also be considered a ‘natural’ partner of $J_\alpha(x)$.

When $\alpha$ is an integer, moreover, as was similarly the case for the functions of the first kind, the following relationship is valid:

$$ Y_{-n}(x) = (-1)^n Y_n(x).\,$$

Both $J_\alpha(x)$ and $Y_\alpha(x)$ are holomorphic functions of $x$ on the complex plane cut along the negative real axis. When $\alpha$ is an integer, the Bessel functions $J$ are entire functions of $x$. If $x$ is held fixed, then the Bessel functions are entire functions of $\alpha$.

Bessel Filters

In electronics and signal processing, a Bessel filter is a type of linear filter with a maximally flat group delay. The name comes from the fact that the denominator of the filter’s transfer function is a reverse Bessel polynomial. Bessel filters are often used in audio crossover systems. Analog Bessel filters are characterized by an almost constant group delay across the entire passband, thus preserving the wave shape of filtered signals in the passband.

A low-pass active filter with a Bessel response is used when the filter needs to exhibit minimum differential delay between the various frequency components of interest contained within the input signal being filtered. In essence this means that the fundamental frequency of, say, an applied square wave experiences the same input-to-output delay as the other harmonics within the filter’s passband. This results in a high degree of fidelity of the output signal relative to the input signal.
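As a sketch of what this looks like in MATLAB, besself (Signal Processing Toolbox) designs the analog prototype, and the group delay can be estimated from the phase response; the order and cutoff below are arbitrary example values.

% Sketch: analog Bessel low-pass design and its group delay
n  = 5;                       % filter order
wc = 2*pi*10e3;               % ~10 kHz cutoff, rad/s
[b, a] = besself(n, wc);      % analog Bessel low-pass prototype
[h, w] = freqs(b, a, 1024);   % analog frequency response
grpdel = -diff(unwrap(angle(h)))./diff(w);   % approximate group delay (s)
semilogx(w(2:end)/(2*pi), grpdel);
xlabel('Frequency (Hz)'); ylabel('Group delay (s)');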


Excel Sorting and Grouping

I had two tables downloaded from Amazon:

Items

Order Date  Order ID    Title   Category
1/26/14 102-4214073-2201835     Everyday Paleo Family Cookbook
1/13/14 115-8766132-0234619     Awesome Book A
1/13/14 115-8766132-0234619     Awesome Book B

and

Orders

Order Date  Order ID    Subtotal
1/6/14  102-6956821-1091413 $43.20
1/13/14 115-8766130-0234619 $19.42
1/16/14 109-8688911-2954602 $25.86

I’m building our Q1 2014 taxes and needed rows in the following format:

1/13/14 115-8766132-0234619 $22.43 Awesome Book A, Awesome Book B

In order to do this without using SQL, I did the following. If column B corresponds to Order ID and column C corresponds to the item Title, then I put the following formula in cell N3:

=+IF(B3=B2,N2 & " | " &C3,C3)

and in cell O3 (in a column that might be named “last before change?”):

=+IF(B3=B4,"", TRUE)

Then I could easily sort out the unwanted values. Done. Still, I would like to better automate this. Any thoughts appreciated.
