
Playing with Matched Filters

During my time on the red team, we continually discussed the role of matched filters in everything from GPS to fire control radars. While I’m having a blast at DARPA where I work in cyber, I wanted to revisit an old topic and put MATLAB’s Phased Array System Toolbox to the test. (Yes, radar friends, this is basic stuff. I’m mostly writing this to refresh my memory and remember how to code. Maybe a fellow manager might find this review helpful, but if you are in this field, there won’t be anything interesting or new below.)

Why use matched filters?

Few things are more fundamental to radar performance than the fact that probability of detection increases with increasing signal-to-noise ratio (SNR). For a deterministic signal in white Gaussian noise (as good an assumption as any for background noise, though the noise does not need to be Gaussian for a matched filter to work), the SNR at the receiver is maximized by using a filter matched to the signal.

One thing that always confused me about matched filters is that they aren’t really a particular type of filter, but more of a design framework: the filter is chosen to suppress the noise as much as possible relative to the signal, which maximizes the output signal-to-noise ratio. One way I’ve heard this described is that the matched filter is a time-reversed and conjugated version of the signal.

The math helps to understand what is going on here. In particular, I want to show that the peak instantaneous signal power divided by the average noise power at the output of a matched filter equals twice the input signal energy divided by the input noise power spectral density, regardless of the waveform the radar uses.

Suppose we have some signal $r(t) = s(t) + n(t)$, where $n(t)$ is the noise and $s(t)$ is the signal. The signal is finite, with duration $T$, and let’s assume the noise is white Gaussian noise with spectral height $N_0/2$. If the combined signal is passed through a filter with impulse response $h(t)$ and the resulting output is $y(t)$, you can write the signal and noise outputs ($y_s$ and $y_n$) in the time domain:

$$ y_s(t) = \int_0^t s(u) h(t-u)\,du $$
$$ y_n(t) = \int_0^t n(u) h(t-u)\,du $$

Since we want to maximize the SNR, we expand the above:

$$\text{SNR} = \frac{y_s^2(t)}{\text{E}\left[y_n^2(t) \right]}$$
$$ = \frac{ \left[ \int_0^t s(u)\, h(t-u)\,du \right]^2}{\text{E}\left[ \left( \int_0^t n(u)\, h(t-u)\,du \right)^2 \right]}$$

The denominator can be expanded:

$$\text{E} \left[y_n^2(t) \right] = \text{E}\left[ \int_0^t n(u)\, h(t-u)\,du \int_0^t n(v)\, h(t-v)\,dv \right] $$

Or

$$ \int_0^t \int_0^t E [ n(u) n(v) ] h(t-u) h(t-v) du\,dv $$

We can further simplify this by invoking a standard white noise model:

$$ E[y_n^2] = \frac{N_0}{2} \int_0^t \int_0^t \delta(u-v)\, h(t-u)\, h(t-v)\, du\,dv $$

Which simplifies nicely to:

$$ \frac{N_0}{2} \int_0^t h^2 (t-u)\, du $$

Now all together we get:

$$ SNR = \frac{ \left[ \int_0^t s(u)\, h(t-u)\,du \right]^2 }{\frac{N_0}{2} \int_0^t h^2 (t-u)\, du } $$

In order to further simplify, we employ the Cauchy-Schwarz inequality, which says that for any two elements (say $A$ and $B$) of a Hilbert space,

$$ \langle A, B \rangle^2 \leq \|A\|^2 \|B\|^2 \text{,}$$

with equality only when $A = k\,B$ for some constant $k$. Applying this, we can then look at the numerator:

$$ \left| \int_0^t s(u)\,q(u) du \right|^2 \leq \int_0^t s^2(u) du \int_0^t q^2(u) du $$

and equality is achieved when $k\,s(u) = q(u)$.

If we pick $h(t-u)$ to be equal to $k\,s(u)$, we can write our optimal SNR as:

$$ SNR^{\text{opt}} (t) = \frac{k^2 \left[ \int_0^t s^2 (u)\, du \right]^2 }{\frac{N_0 k^2}{2} \int_0^t s^2(u)\, du} = \frac{\int_0^t s^2(u)\, du}{N_0/2}$$

Since $s(t)$ always has a finite duration $T$, the SNR is maximized by setting $t=T$, which gives the well-known formula:
$$SNR^{\text{opt}} = \frac{\int_0^T s^2(u)\, du}{N_0/2} = \frac{2 \epsilon}{N_0}$$
where $\epsilon$ is the energy of the transmitted signal.

So, what can we do with matched filters?

Let’s look at an example that compares the results of matched filtering with and without spectrum weighting. (Spectrum weighting is often used with linear FM waveforms to reduce sidelobes.)

The simplest pulse compression technique I know is to shift the frequency linearly throughout the pulse. For those not familiar with pulse compression, a little review might be helpful. One fundamental issue in designing a good radar system is its ability to resolve small targets at long ranges with scant separation. This requires high energy, and the easiest way to get it is to transmit a longer pulse with enough energy to detect a small target at long range. However, a long pulse degrades range resolution. We can have our cake and eat it too if we encode a frequency change in the longer pulse. Hence, frequency or phase modulation of the signal is used to achieve a high range resolution when a long pulse is required.

The capabilities of short-pulse and high range resolution radar are significant. For example, high range resolution allows resolving more than one target with good accuracy in range without using angle information. Other applications of short-pulse and high range resolution radar include clutter reduction, glint reduction, multipath resolution, target classification, and Doppler tolerance.

The LFM pulse in particular has the advantage of greater bandwidth while keeping the pulse duration short and envelope constant. A constant envelope LFM pulse has an ambiguity function similar to that of the square pulse, except that it is skewed in the delay-Doppler plane. Slight Doppler mismatches for the LFM pulse do not change the general shape of the pulse and reduce the amplitude very little, but they do appear to shift the pulse in time.

Before going forward, I wanted to establish the math of an LFM pulse. With a center frequency of $f_0$ and chirp slope $b$, we have a simple expression for the intra-pulse phase:

$$
\phi (t) = f_0 \, t + b\,t^2
$$

Taking the derivative of the phase function gives the instantaneous frequency:

$$ \omega_i (t) = f_0 + 2\,b\,t. $$

For a chirp defined on the interval $[0, T_p]$, $\omega_i(0) = f_0$ is the minimum frequency and $\omega_i(T_p) = f_0 + 2b\,T_p$ is the maximum frequency. The sweep bandwidth is then $2\,b\,T_p$, and if $u(t)$ is the (unit) pulse envelope, a single pulse can be written as:

$$ S(t) = u(t) e^{j 2 \pi (f_0 t + b t^2)} \text{.}$$

I learn by doing, so I created a linear FM waveform with a duration of 0.1 milliseconds, a sweep bandwidth of 100 kHz, and a pulse repetition frequency of 5 kHz. Then, we will add noise to the linear FM pulse and filter the noisy signal using a matched filter. We will then observe how the matched filter works with and without spectrum weighting.
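Here is roughly what that setup looks like in MATLAB. This is a sketch that assumes the Phased Array System Toolbox is available; the 1 MHz sample rate is my own choice, while the other parameters come from the description above.

hwav = phased.LinearFMWaveform('PulseWidth',1e-4, 'PRF',5e3, ...
    'SweepBandwidth',1e5, 'SampleRate',1e6, ...
    'OutputFormat','Pulses', 'NumPulses',1);
sig = step(hwav);                          % samples of one chirped pulse
t = (0:numel(sig)-1)/hwav.SampleRate;      % time axis in seconds
plot(t*1e6, real(sig));
xlabel('Time (\mus)'); ylabel('Amplitude');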

Which produces the following chirped pulse,

[Figure: the chirped LFM pulse]

From here, we create two matched filters: one with no spectrum weighting and one with a Taylor window. We can then see the signal input and the matched filter output:

[Figure: matched filter input and output]
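Constructing the two filters is only a couple of lines. A sketch, assuming getMatchedFilter and phased.MatchedFilter from the same toolbox (the variable names are mine):

coeff = getMatchedFilter(hwav);            % matched filter taps for the waveform
hmf        = phased.MatchedFilter('Coefficients', coeff);
hmf_taylor = phased.MatchedFilter('Coefficients', coeff, ...
                                  'SpectrumWindow', 'Taylor');
y_plain  = step(hmf, sig);                 % output with no spectrum weighting
y_taylor = step(hmf_taylor, sig);          % output with Taylor spectrum weighting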

To really see how this works we need to add some noise:

% Create the signal and add noise.
sig = step(hwav);
rng(17)
x = sig+0.5*(randn(length(sig),1)+1j*randn(length(sig),1));

And we can see the impact noise has on the original signal:

[Figure: the input signal plus noise]

and the final output (both with and without a Taylor window):

[Figure: matched filter output, with and without a Taylor window]

The Ambiguity Function

While it is cool to see the matched filter working, my background is more in stochastic modeling and my interest is in the radar ambiguity function, which is a much more comprehensive way to examine the performance of a matched filter. The ambiguity function is a two-dimensional function of time delay and Doppler frequency, $\chi(\tau,\nu)$, showing the distortion of the matched filter output caused by the Doppler shift of the return from a moving target. It is the time response of a filter matched to a given finite-energy signal when the signal is received with a delay $\tau$ and a Doppler shift $\nu$ relative to the nominal values expected by the filter, or:

$$
|\chi ( \tau, \nu)| = \left| \int_{-\infty}^{\infty} u(t)\, u^* (t + \tau)\, e^{j 2 \pi \nu t}\, dt \right| \text{.}
$$
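MATLAB’s ambgfun (also in the Phased Array System Toolbox) computes this surface directly. A rough sketch, reusing the hwav and sig defined earlier:

[afmag, delay, doppler] = ambgfun(sig, hwav.SampleRate, hwav.PRF);
surf(delay*1e6, doppler/1e3, afmag, 'LineStyle', 'none');
xlabel('Delay \tau (\mus)'); ylabel('Doppler \nu (kHz)');
zlabel('|\chi(\tau,\nu)|');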

What is the ambiguity function of an uncompressed pulse?

For an uncompressed, rectangular pulse the ambiguity function is relatively simple and symmetric.

[Figure: ambiguity function of a rectangular pulse]

What does the ambiguity function look like for the LFM pulse described above?

If we compare two pulses, each with a duty cycle of one (PRF of 20 kHz and pulse width of 50 µs), we can see their differing ambiguity functions:

[Figure: ambiguity functions of the two pulses]

If we look at the ambiguity function of an LFM pulse with the following properties:

        SampleRate: 200000
        PulseWidth: 5e-05
               PRF: 10000
    SweepBandwidth: 100000
    SweepDirection: 'Up'
     SweepInterval: 'Positive'
          Envelope: 'Rectangular'
      OutputFormat: 'Pulses'
         NumPulses: 5

then we can see how complex the surface is:

[Figure: 3-D ambiguity function of the LFM pulse]

References

  • http://www.ece.gatech.edu/research/labs/sarl/tutorials/ECE4606/14-MatchedFilter.pdf
  • MATLAB help files

One Time Pad Cryptography

This was much harder than it should have been. While this is certainly the most trivial post on crypto-math on the web, I wanted to share my MATLAB xor code in the hope that I save someone else some time. It is a basic property of cryptography that a one-time pad must be used only once. An example like this makes it very concrete:

Suppose you are told that the one-time pad encryption of the message “attack at dawn” is 09e1c5f70a65ac519458e7e53f36 (the plaintext letters are encoded as 8-bit ASCII and the given ciphertext is written in hex). What would be the one-time pad encryption of the message “attack at dusk” under the same OTP key?

Let $m_0$ be the message “attack at dawn” and $m_1$ the message “attack at dusk”, with $c_0$ and $c_1$ the corresponding ciphertexts. If $p$ is the one-time pad (OTP) that encrypts the message, we then have:

$$ c_0 = m_0 \oplus p$$

So we can obtain the one-time pad by performing an XOR of the ciphertext with the plaintext:

$$ p = c_0 \oplus m_0 \text{.}$$

This enables us to encrypt the new message without using the OTP explicitly:

$$c_1 = m_1 \oplus p = m_1 \oplus \left(c_0 \oplus m_0 \right) = c_0 \oplus (m_0 \oplus m_1) \text{.}$$

You could truncate down to only the characters that are different, but since I’m writing a script for this, I didn’t bother.

In Python 2 this would be super short:

def strxor(s1, s2):
    return ''.join(chr(ord(a) ^ ord(b)) for a, b in zip(s1, s2))

strxor(strxor("6c73d5240a948c86981bc294814d".decode('hex'), "attack at dawn"), "attack at dusk").encode('hex')

>>>'6c73d5240a948c86981bc2808548'

But the government won’t allow me to have Python on my laptop, so I need to use MATLAB.
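For what it’s worth, the MATLAB version ends up almost as short. A sketch (the strxor helper and the variable names are my own):

strxor = @(s1, s2) char(bitxor(double(s1), double(s2)));   % XOR two ASCII strings

c0hex = '6c73d5240a948c86981bc294814d';                    % given ciphertext (hex)
c0 = char(hex2dec(reshape(c0hex, 2, []).').');             % hex string -> raw bytes
c1 = strxor(strxor(c0, 'attack at dawn'), 'attack at dusk');
c1hex = lower(reshape(dec2hex(double(c1), 2).', 1, []))    % -> '6c73d5240a948c86981bc2808548'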

Some Helpful Links

  • My course slides: http://spark-university.s3.amazonaws.com/stanford-crypto/slides/02-stream-v2-annotated.pdf
  • Another Solution: http://crypto.stackexchange.com/questions/10534/how-to-decode-an-otp-message
  • Some good general info: http://www.binaryvision.nl/3075/statistical-test/

Cryptography

We are swimming in a sea of information. Without encryption this whole operation would be a very public event. Encryption enables our communications to be anonymous or secure and makes e-commerce and private communications possible. Because of this, I registered for Dan Boneh’s cryptography course. At this point, I’ve only watched one lecture, but I have some initial thoughts (and some code) that I wanted to jot down.

At its simplest, encryption is used to convert data into a form that a third party can’t read. There are multiple reasons for wanting to do this, but the most common is a desire for security or privacy. In a more comprehensive sense, the aim of cryptography is to construct and analyze protocols that overcome the influence of adversaries and that address various aspects of information security such as data confidentiality, data integrity, authentication, and non-repudiation. Modern cryptography is heavily based on mathematical theory and on assumptions about the capability of current computer hardware. No code is infinitely secure, and nearly all are theoretically possible to break, but a “secure” code is one considered infeasible to break by any known practical means. The study of encryption is of high interest to me because it intersects many of my current interests: the disciplines of mathematics, computer science, and electrical engineering.

Dan Boneh’s initial lecture covered the traditional overview, definition, and history of cryptography. His overview was rich with basic applications of cryptography and its role in our lives today.

The concept of encryption and decryption requires some extra information to encode and decode the signal. Though this information takes many forms, it is commonly known as the key. In some cases the same key can be used for both encryption and decryption (a shared key), while in other cases encryption and decryption require different keys (such as a public/private key arrangement), and this is one way to organize existing techniques.

He provides a strong admonishment to use “standard cryptographic primitives” that have withstood public scrutiny, and makes the point that without the necessary peer review by a very large community over many, many years, one can’t trust a cryptographic implementation. For this reason he admonishes the student never to trust an implementation based on proprietary primitives. (The student is left to wonder what exactly a cryptographic primitive is.)

He highlights that cryptography has its limitations, and even a secure cryptographic channel does not guarantee a secure implementation. It was helpful that he followed this statement up with what exactly an insecure implementation is by surveying how to break different ciphers. He mentions a known-insecure Blu-ray protection standard called AACS and promises a forthcoming discussion of the mechanics of its compromise.

From here, he discusses applications such as private elections and auctions and also the mechanism of digital signatures. He ends the lecture by discussing some of the “magic” recent developments in encryption, such as homomorphic encryption, where operations can be performed on encrypted data without decryption. (See the DARPA PROCEED program.) This has fascinating applications, such as the ability to query a database without giving an observer of the database (before, during, or after) any insight into the nature of the query.

He closes by stating that any cryptographic implementation has three requirements: a precise specification of a threat model, a proposed construction, and a proof that breaking the construction under the threat model would solve an underlying hard problem.

The next lecture was my favorite. Here, Boneh surveyed the history of cryptography, which included a lot of the codes you play with in school, such as symmetric and substitution ciphers, along with a discussion of how to break them using frequency and pattern recognition techniques (mostly based on known distributions of letters in the underlying language).

He then introduces interesting ciphers such as the Caesar cipher, in which each letter of the alphabet is shifted along some number of places; for example, in a Caesar cipher of shift 3, A would become D, B would become E, Y would become B, and so on. He then moves on to more complex ciphers such as the Vigenère cipher (a simple form of polyalphabetic substitution developed in the 16th century) and an interesting discussion of rotor machines (the most famous of which was the German Enigma). The Vigenère cipher consists of several Caesar ciphers in sequence with different shift values. He ended the lecture with a quick description of modern encryption techniques.

I always enjoy the introductory lectures of a course: I can generally follow everything and am excited about the material. A lot of my friends in college would not go to the first lecture, but I never missed it. It was nice to have at least one lecture where I could follow along. This sounds like it will be an interesting ride.

Enough of the discussion; let’s see how this works. As discussed above, the Vigenère cipher produces encrypted ciphertext from an input plaintext message using a key and a matrix of substitution alphabets. With the MATLAB below, you can generate the Vigenère square, also known as the tabula recta, which can be used for both encryption and decryption. It consists of the alphabet written out 26 times in different rows, each alphabet shifted cyclically to the left compared to the previous one, corresponding to the 26 possible Caesar ciphers. At different points in the encryption process, the cipher uses a different alphabet from one of the rows; the alphabet used at each point depends on a repeating keyword.

Let’s try this with MATLAB (see code below). If I use a key of ‘Ceasar knows something you do not’ and a secret message of Titus Lucretius Carus (a Roman epicurean epic poet), we get:

encrypt('Titus Lucretius Carus', 'Ceasar knows something you do not')
VMTLSQKDPE KHLFLGTYBE
decrypt('VMTLSQKDPE KHLFLGTYBE', 'Ceasar knows something you do not')
TITUS LUCRETIUS CARUS

It works!
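The encrypt and decrypt functions themselves aren’t reproduced above, so here is a minimal sketch of how such functions might look. It is my own version: it assumes A-Z only, strips non-letters from the key, and passes non-letters in the message straight through, so its handling of spaces may differ from the exact output shown above. Save each function in its own .m file (or keep them as local functions).

function c = encrypt(msg, key)
% Vigenere encryption sketch: shift each letter forward by the key letter
c = vigenere(msg, key, +1);
end

function m = decrypt(msg, key)
% Vigenere decryption sketch: shift each letter back by the key letter
m = vigenere(msg, key, -1);
end

function out = vigenere(msg, key, direction)
msg = upper(msg);
key = upper(key(isletter(key)));          % keep only the letters of the key
out = msg;
k = 1;                                    % index into the repeating key
for i = 1:length(msg)
    if isletter(msg(i))
        shift = key(k) - 'A';
        out(i) = char(mod(msg(i) - 'A' + direction*shift, 26) + 'A');
        k = mod(k, length(key)) + 1;      % advance the key only on letters
    end
end
end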


Review: Boys Should Be Boys: 7 Secrets to Raising Healthy Sons

From a one-star review on Amazon:

The content was obvious and the tone was judgmental. The complete lack of nuance is painful. Apparently receiving an MD over 25 years ago makes this Dr. Laura-style author an expert in child psychology? Let’s leave the psychology topics to those professionally trained in that discipline.

I’ve always enjoyed Meg Meeker’s books and her latest is no exception. She’s very practical and conversational, and she brings together a refreshing mix of social conservatism and practical medical know-how. Every chapter is focused on concrete advice for parents to become more effective in crafting the virtue (and therefore well-being) of our sons. As the quote above shows, she presents a perspective that is out of sync with aspects of modern culture, and in particular with modern medicine’s trend of hyper-specialization and the unwritten rule to leave moral judgments out of medical advice. We all know this trend has risks: a specialist is going to miss the whole-person concept that is critical to understand as we tackle a problem as complex as parenting. Forcing those who give us medical and parenting advice to be materialists forces life’s great questions out of the discussion. A materialist view misses the most important dynamics in developing character and sons who become men of virtue.

Meg Meeker explains that boys need the freedom to explore and test their limits, even if this means some scrapes, bruises and difficult moments. She tries to strike the balance between helicopter and laissez-faire parenting. She slices through these two extremes with a simple call to engage: to double the time we spend with our boys all the while loving them enough to force them to grow in difficult and engaging situations.

It is out of the tension between caring too little and caring too much that she weaves her plan for an ideal father. In many ways, I find her book more interesting for what it says not to do than for what it says to do. She reminds us of the danger of letting our boys be cast adrift into a toxic mix of video games, ersatz online relationships, and a hyper-sexualized culture that elevates an individual’s emotions over an external, fixed framework of morality.

She makes it clear that there’s no substitute for personal time and attention. She paints the ideal parent as always engaged and aware of what their children are doing, in a manner that doesn’t dictate the details of their life but does pour compassion and love into their schedule while allowing them to grow and develop in natural situations. In reading her book, she makes it clear that to avoid the harmful influences of society, we as fathers have to be committed and focused on protecting our sons and fostering the right environment, one that allows them to develop in healthy ways.

In 12 chapters she starts with a review of the problem and then goes over seven areas of focus. Here, in brief, they are:

  • Know how to encourage your son. One fault is babying and spoiling him. But another is being so harsh that you lose communication with your son and destroy his sense of self worth. We’ll look at how to strike the right balance.
  • Understand what your boys need. Guess what? It’s not another computer game; it’s you. We’ll look at how to get the most of your time with your son.
  • Recognize that boys were made for the outdoors. Boys love being outside. A healthy boy needs that sense of adventure— and the reality check that the outdoors gives him.
  • Remember that boys need rules. Boys instinctively have a boy code. If you don’t set rules, however, they feel lost.
  • Acknowledge that virtue is not just for girls. Boys should, indeed, be boys—but boys who drink, take drugs, and have sex outside of marriage aren’t “normal” teenagers; they have been abnormally socialized by our unfortunately toxic culture. Today, my practice as a pediatrician has to deal with an epidemic of serious, even life-threatening, problems—physical and psychological—that were of comparatively minor concern only forty years ago. A healthy boy strives after virtues like integrity and self-control. In fact, it is virtues like these that make a boy’s transition to manhood possible.
  • Learn how to teach your son about the big questions in life. Many parents shy away from this, either because they are uncomfortable with these questions themselves, or want to dismiss them as unimportant or even pernicious, or because they don’t want to “impose” their views on their children. But whatever one’s personal view, your son wants to know— and needs to know—why he’s here, what his purpose in life is, why he is important. Boys who don’t have a well-grounded understanding on these big questions are the most vulnerable to being led astray into self-destructive behaviors.
  • Remember, always, that the most important person in your son’s life is you.

In the second chapter, she addresses how to deal with peer pressure, with a particular emphasis on how toxic our culture is for boys and their identities. She goes on to discuss boys’ natural tendencies and how helpful rough and dangerous activities can be; this is exactly the natural state of boys’ development. She points out that neighborhood games played by boys of different ages force them to learn important life lessons that they can’t learn anywhere else.

In the fourth chapter, she explores the role of electronics and virtual worlds and the influence they have on the development of young boys. I’d recommend that all parents of young children read Parmy Olson’s book “We Are Anonymous” to better understand how amazingly toxic (and captivating) the underbelly of the internet is for children. (Something we all know, but she brings it forward in vivid detail.)

In an interesting turn, she then explores the societal animosity towards teenage boys. I just finished Sheryl Sandberg’s Lean In, in which she makes an excellent case for societal biases against women leaders in business. By contrast, Meeker makes an excellent point that society presents a self-reinforcing feedback loop that casts teenage boys as moody, depressed, and angry. She makes some excellent points that have particular poignancy coming from a medical professional who is used to dealing with teenage boys: it is okay to be depressed and moody, and this is not cause for alarm or overreaction from parents. The solution is more old-fashioned than our modern and hyper-specialized world wants: more time and attention, in the context of a strategic perspective.

The focus of that increased time and attention is the subject of the next chapter. Chapter 6 talks about practical ways to build self-confidence and mental health in our boys. She talks about the critical importance of the father’s blessing, something that always proved far too elusive for me. She describes the feeling of true accomplishment as a powerful emotional resource builder. (I think it’s helpful to contrast true accomplishment with the common empty, fawning praise and declarations of how special my children are that they find in school.)

Particularly convicting is her clarion call to live my life in an exemplary way that sets the right standards for my son. I want to model the virtues that he should have, and she challenges us to picture our son at the age of 25 and to foster the virtues we desire in him, much in the same way Dan Allender’s Bold Love tells us to carefully and tirelessly pursue love with the cunning of a fox.

She moves on in the next chapter to discuss why so many men are merely aged adolescents: They never got through the transition from being a boy to becoming a man. She diagnoses this, in her clinical way, as the result of the absence of a father’s guidance.

She matches this with the next chapter, which talks about the importance of faith and of the knowledge of an external God to whom boys feel accountable. She describes how faith in God helps children have a well of hope to draw from as life gets tough, develop an understanding of love that is more than a pleasurable act between bodies, understand the importance of truth and accountability, and grasp the critical importance of repentance, forgiveness, and grace. Here, as in the rest of the book, she makes it clear that this doesn’t mean simply dropping off your son at church and hoping he finds God; she calls fathers to again be the best that they can be, for themselves but also for their sons, and to model the ideal behavior.

My biggest criticism of her book is the way she remains generic towards faith. While a Judeo-Christian concept of God has been foundational to a historical US worldview, I think she should be more honest in explaining the particular faith she holds and its critical importance to our sons’ eternal destiny. A general “faith” without conviction is not what we want our sons to have. Is she really advocating teaching our sons about Islam? She remains neutral on how God is defined, without question, to reach a broader audience. But her own faith of Christianity claims exclusivity, and I found it disappointing that she avoided this.

Her book culminates in the 11th chapter, where she calls us to ideate the core virtues we want our sons to have that will ensure they make the transition from boy to man. She emphasizes virtues we all want in our children such as integrity, courage, humility, meekness, and kindness. She doesn’t just introduce these as words but fully fleshes them out into concepts and practical steps to build them in our sons.

She ends the book with 10 tips to remember and a call to double whatever time we currently are investing in our sons. Here are the 10 tips:

1) Know that you change his world
2) Raise him from the inside out (worry about his inner life and the outer life will follow)
3) Help his masculinity to explode
4) Help him find purpose and passion (other than being a video game master)
5) Teach him to serve (this is where church can come in handy)
6) Insist on self-respect
7) Persevere
8) Be his hero
9) Watch, then watch again (pay close attention to what is going on in his life)
10) Give him the best of yourself (not just the leftovers)


GE GXULQR Twist and Lock Kitchen or Bath Filtration System Replacement Filter Won’t Fit

If you ever have to replace your under the sink GE filter (specifically the GE GXULQR), be ready for a frustrating ride.

It looks like this:

[Figure: the GE GXULQR filter]

You can find the manual here and the GE product page here.

Although it is billed as the “Twist and Lock” replacement filter, the inside female hex socket sometimes rotates out of alignment when you remove the old filter. If this happens, there is no easy fix. Moreover, since the receiving head is generally under a sink or in some place you can’t easily access, it is very frustrating when you are pushing into the plastic receiver and the new filter doesn’t fit. (It is extra frustrating if you have a cast on a broken right hand, carpal tunnel in your left hand, and can’t bend your back due to a herniated L5/S1.)

[Figure: the filter head/bracket]

The instructions only say to “Push filter into the filter head/bracket. Turn filter 1/4 turn to the right until it stops. The top surface of the filter will be flush with the bottom of the filter head/bracket when fully installed.” This only works if everything is properly lined up. The solution is to re-align the receiver with a new tool made from the old filter. By cutting off the head of the old filter (I used an angle grinder) and hammering in a flathead screwdriver, you get a custom alignment tool that looks like this:

[Figure: the custom alignment tool]

How far should you rotate it? Enough to get the socket to line up so the flanges align. For me, this meant the top of the hex was flat. The biggest challenge is knowing how hard to turn: I still don’t know the internal mechanism, the casing is all plastic, and I didn’t want to break it. I had to rotate it pretty hard before it turned. This was a little scary because water started to leak out from inside. However, after my wife rotated in the cartridge (two hands were necessary, and a cast doesn’t help), everything seems to be working fine.

Hopefully, this spares you some frustration.


Gradient Descent


It has been a while since I’ve studied optimization, but gradient descent is always good to brush up on. Most optimization involves derivatives. Often known as the method of steepest descent, gradient descent works by taking steps proportional to the negative of the gradient of the function at the current point.

Mathematically, for two parameters, say $\theta_0$ and $\theta_1$, gradient descent is defined by repeated updates of:

$$ \theta_j := \theta_j - \alpha \frac{\partial J (\theta_0 , \theta_1) }{\partial \theta_j } $$

from the hypothesis:

$$ \begin{aligned}
h_\theta(x) = \sum_{i=1}^n \theta_i x_i = \theta^{T}x.
\end{aligned} $$

The $\alpha$ is the learning rate. If $\alpha$ is very large, we will take huge steps downhill; a small $\alpha$ means baby steps downhill. If $\alpha$ is too small, it might take too long to get our answer; if $\alpha$ is too large, we might step past our solution and never converge. Besides the pitfall of picking an $\alpha$, you need a cost function whose derivatives exist (it must be differentiable) and, ideally, convex.

Often for linear regression we use batch gradient descent. In machine learning terminology, batch gradient descent uses all training examples: every step of gradient descent uses the residuals from the entire training set in the least squares calculation.

Why is this relevant to machine learning? For linear regression there is an analytical solution (the normal equation), but it involves a matrix inversion that can be very large and computationally infeasible:

$$ \theta = (X^T X)^{-1} X^T y $$

Gradient descent gets used because it is a numerical method that generally works, is easy to implement, and is a very generic optimization technique. Also, analytical solutions are strongly tied to the model, so implementing them can be inefficient if you plan to generalize or change your models in the future. They are sometimes less efficient than their numerical approximations, and sometimes they are simply harder to implement.
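As a concrete illustration, here is a minimal batch gradient descent for linear regression on made-up data, with the normal equation alongside for comparison. The data, learning rate, and iteration count are arbitrary choices for the sketch.

% Minimal sketch: batch gradient descent for linear regression (fake data)
rng(1);
m = 100;                              % number of training examples
x = linspace(0, 10, m).';
X = [ones(m,1) x];                    % design matrix with intercept column
y = 3 + 2*x + randn(m,1);             % noisy line, true theta = [3; 2]

alpha = 0.01;                         % learning rate
theta = zeros(2,1);
for iter = 1:5000
    grad  = (X.'*(X*theta - y))/m;    % gradient of (1/2m)*||X*theta - y||^2
    theta = theta - alpha*grad;       % gradient descent update
end

theta_analytic = (X.'*X)\(X.'*y);     % normal equation, for comparison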

To sum up, gradient descent is preferable over an analytical solution if:

  • you are considering changes in the model, generalizations, adding some more complex terms/regularization/modifications
  • you need a generic method because you do not know much about the future of the code and the model
  • analytical solution is more expensive computationally, and you need efficiency
  • analytical solution requires more memory or processing power/time than you have or want to use
  • analytical solution is hard to implement and you need easy, simple code

FlexEvents: WOD 3 Simulation

During a recent CrossFit competition, Flex on the Mall, we had to accomplish a team chipper in which we were able to pick the order of team members in order to minimize our workout time. The workout was simple: 50 burpees, 40 over-the-box jumps, and 30 kettlebell snatches. Each exercise had to be accomplished in serial: no one could start an exercise until the previous team member had finished it. This meant everyone waited while the first person started. As simple as this was, I got confused when figuring out the optimal order for team members: should the slowest go first or last?

At the time, I thought the best strategy would be to have the fastest go first, to prevent the scenario where the fast folks were waiting and unable to help the team. I was focused on the idea that you didn’t want anyone to be waiting — so clear out the fast people first. My partners wanted the slowest to go first because the slower participants could rest, go super slow and not affect the final score.

While I was wrong and they were correct, this didn’t make sense at the time, because I was viewing the whole thing as a linear operation where order didn’t matter but waiting on someone would definitely slow down the overall time. It turns out that if you put the slowest last, no one is waiting, but the clock is running entirely on the slowest person’s time, when you otherwise could have hidden their slow time. The workout came down to the following critical path: the time it took to do everyone’s burpees plus the last participant’s time on the remaining events.

However, this is only true if the time it took to do burpees was significantly more than the other events and there was not a significant difference in fitness between team members. After the competition, I wanted to understand the dynamics of this workout and build a quick model to understand where these assumptions held true and hone my intuition for stuff like this.

It turns out the worst thing to do is have the slowest person go last. The reason is really simple: you are putting them in the critical path. In fact, an optimal strategy is to have the slowest go first. Assume all four members have different speeds, based on expected values of their workout times. Let’s say the four members have the following expected completion times (in notional units), where person 1 is the fastest and each successive participant is slower in all events.

Note: These are totally made up numbers and have nothing to do with our team . . . says the man whose wife now routinely kicks his scores to the curb.

person      1    2    3    4
burpees     9   10   15   17
box jumps   7    8   13   15
KB snatch   5    7   11   13


In this case, I wrote some MATLAB to look through all (4! = 24) permutations. (Remember: 1 is fastest, 4 is slowest.)
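A sketch of that permutation search is below. It assumes the rule is that a person cannot start an exercise until the previous team member has finished that same exercise (and cannot start their own next exercise until they finish the current one); the times are the notional values from the table above.

% Sketch of the permutation search over team orderings
t = [ 9 10 15 17;      % burpee times for persons 1-4
      7  8 13 15;      % box jump times
      5  7 11 13 ];    % KB snatch times

orders = perms(1:4);                 % all 24 possible orderings
total  = zeros(size(orders,1),1);
for p = 1:size(orders,1)
    o = orders(p,:);
    C = zeros(3,4);                  % C(j,k): finish time of exercise j for k-th athlete
    for k = 1:4
        for j = 1:3
            prevAthlete  = 0; if k > 1, prevAthlete  = C(j,k-1); end
            prevExercise = 0; if j > 1, prevExercise = C(j-1,k); end
            C(j,k) = max(prevAthlete, prevExercise) + t(j,o(k));
        end
    end
    total(p) = C(3,4);               % team time: last athlete finishes last exercise
end
[best, idx] = min(total);
bestOrder = orders(idx,:);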

[Figure: total workout time for all 24 orderings]

This is such a simple event that you can see the basic building blocks without much introspection: 21 is the fastest individual time through the sequence and 45 is the slowest. If the team were made up entirely of person 1, they could complete the whole thing in 48. As this fictional team stands, burpees alone will account for 51 regardless of order, but if the fastest goes at the end, you can get away with only adding 15 to the total time, versus adding 28 if the slowest person goes last.

Several things here. It would be simple enough to work the math with variables instead of numbers, and it probably should be done. A better analysis could show how much variation you could tolerate between athletes before the dynamics above no longer apply. Maybe another lunch break for that.

OK, on further thought, I decided to look at several scenarios and show the optimal strategies.

A team member is much slower on burpees, but faster on the other events.

In this case, it still makes sense for her to go last. The burpees are linear, but speed in the bottom two events determines the order. It helps me to assign a color to each team member and see all 24 strategies on one plot. The lightest shade is member one, who has a burpee time of 40 compared to 4 for the others but is faster in the other two events. In the plot below, the fastest combinations are at the top, where you can see that all the fastest outcomes have member one going last.

[Figure: outcomes when member one is much slower on burpees]

Burpee times equal KB snatch times

Suppose their event times look like this, where everyone’s KB snatch times equal their burpee times:

burpee times:    1   2   3   4
box jump times:  2   2   2   2
KB snatch times: 1   2   3   4

Then you get a truly random pattern where all outcomes are equal regardless of order. So in this case it doesn’t matter what order you do the workout in, even though 4 is much slower than 1. Interesting and counter-intuitive.

[Figure: all orderings produce equal total times]

My code is below for any of those interested.


CY2014 Quarter 1 Financial Review

Chrissy and I review our spending on a quarterly basis. Updating every 90 days isn’t too long to correct mistakes and remember purchases, but it also allows for the busy multi-week sprints that life presents us. While we have used every financial management program available, I’ve found the most straightforward and flexible solution is to download historical transactions into Excel where I can assign categories and do the type of analysis you can see below. This works for me because I have complete control. All the other solutions I used (MS Money, Quicken, Mint, GNU Wallet) introduce errors that have required lots of time to fix (or that can’t be fixed), but more importantly they constrain me to their interface and I got used to exporting information into tools that could flexibly answer my questions.

My basic workflow is to download statements from all our bank accounts and credit cards and put them all into one spreadsheet, where I ensure a consistent list of categories. I can do this quickly by filtering and sorting, as most of our expenses are cyclical. Once everything is in the right format, I use lots of Excel SUMIF and SUMIFS functions to produce reports.

My purpose in doing a financial review is to accomplish the following:

  • Quality check (Are we getting paid the right amounts? Any incorrect expenses?)
  • Spending feedback (Are we overpaying in any categories? Anything we need to rein in?)
  • Tax Production

While the tax production and quality check were very helpful to me, I wanted to share the results of the spend analysis in case my reports might be useful to others.

Spending feedback

In summary, we had a small rise in our overall Grocery and Dining out categories, but the major cost drivers were:

  • Ellie’s 12 cavities were very expensive (no dental insurance)
  • We bought a new espresso machine (major purchase for us)
  • We bought a new car
  • We went crazy on clothes
  • We committed (again) to Army Navy Country Club

[Figure: spending by category]

Where are we spending?

This doesn’t have a real effect on our spending, but I thought it was interesting. We don’t have savings/investments in here; this is just “spending”. I treated stuff like insurance, taxes, medical, fees, haircuts, etc. as “cost of life”: things I feel we can’t avoid and don’t really have discretion in spending. Some other stuff that might fit this category (the power bill) gets lumped into household (as do home maintenance and the mortgage). I would love to do some more analysis and compare our spending to this article.

[Figure: spending breakdown pie chart]

Daily Feedback

The plot below has categories on the y-axis and days on the x-axis. Intensity of color is the spend amount. I used MATLAB to produce this plot. I like it because the colormap filters everything in a way that comes out like a log scale, and that tells me what is a big deal and what is noise. The interesting dynamic is the frequency/magnitude trade-off in spending: medical comes in seldom, big chunks, while groceries are a constant but smaller expense.
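For anyone curious, a plot like this can be produced with imagesc; the sketch below uses made-up categories and random data rather than our actual transactions.

% Sketch: daily spending heatmap by category (fake data for illustration)
categories = {'Grocery','Dining','Medical','Household','Clothes'};
days  = 1:90;
spend = abs(randn(numel(categories), numel(days))).^3 * 20;   % fake spending
imagesc(days, 1:numel(categories), spend);
set(gca, 'YTick', 1:numel(categories), 'YTickLabel', categories);
xlabel('Day of quarter'); ylabel('Category');
colorbar;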

[Figure: daily spending heatmap by category]

You can see that our daily spending has a huge variance: the standard deviation was twice our average daily spending, so big purchases had a pronounced effect. The bar chart below breaks spending into levels of discretion, from totally discretionary (dining out) through some and limited discretion (haircuts, medical) down to non-discretionary (mortgage, tax) at the bottom.

[Figure: daily spending bar chart by discretion level]

Weekly Feedback

Click on the image below to see it full size.

[Figure: weekly spending]

So how much can we control this?

If I break down spending into four categories:

  • Committed — We have to pay it (i.e. Mortgage)
  • Limited Discretion — We can commit extra time to reduce it (i.e. Home and Car Maintenance)
  • Some Discretion — We can make choices to decrease our quality of purchase (i.e. Groceries)
  • Total Discretion — We can do without this if we have to (i.e. Dining Out/New Clothes)

It turns out that a third of our expenses are committed, while about a quarter each fall into limited and some discretion. Roughly 20% of our expenses are totally discretionary, and 70% of our expenses could be changed if we had to. The takeaway for me is to focus on eliminating the stuff we pay for but don’t enjoy (fees) and the things that don’t bring joy/reward for their cost.


Fun with Bessel Functions


Well, I certainly forget things faster than I learn them. Today is a quick review of Bessel functions and their applications to signal processing.

The Bessel functions appear in lots of situations (think wave propagation and static potentials), particularly those that involve cylindrical symmetry. While special types of what would later be known as Bessel functions were studied by Euler, Lagrange, and the Bernoullis, the Bessel functions were first used systematically by F. W. Bessel to describe three-body motion, with the Bessel functions appearing in the series expansion of planetary perturbations.

First, I think they should be called Bernoulli-Bessel functions, both because that sounds more pompous and because they were discovered by Daniel Bernoulli and generalized by Friedrich Bessel. While they sound (and can be) complicated, they are the canonical solutions $y(x)$ of Bessel’s differential equation:

$$ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \alpha^2)y = 0 $$

for an arbitrary complex number $\alpha$ (where $\alpha$ denotes the order of the Bessel function). The most important cases are $\alpha$ an integer or half-integer. Since all math ties together, I find it pretty cool that Bessel functions appear in the solution to Laplace’s equation in cylindrical coordinates.

Although $\alpha$ and $-\alpha$ produce the same differential equation, it is conventional to define different Bessel functions for these two values in such a way that the Bessel functions are mostly smooth functions of $\alpha$.

Bessel functions of the first kind: $J_\alpha$

Bessel functions of the first kind, known as $J_\alpha(x)$, are solutions of Bessel’s differential equation that are finite at the origin ($x = 0$) for integer or positive $\alpha$, and diverge as $x$ approaches zero for negative non-integer $\alpha$. It is possible to define the function by its Taylor series expansion around $x = 0$:

$$ J_\alpha(x) = \sum_{m=0}^\infty \frac{(-1)^m}{m! \, \Gamma(m+\alpha+1)} {\left(\frac{x}{2}\right)}^{2m+\alpha} $$

where $\Gamma(z)$ is the gamma function, a shifted generalization of the factorial function to non-integer values.
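MATLAB ships with besselj, so it is easy to get a feel for the first few orders. A quick sketch:

% Plot the first few Bessel functions of the first kind
x = 0:0.05:20;
hold on;
for alpha = 0:3
    plot(x, besselj(alpha, x));
end
hold off;
legend('J_0','J_1','J_2','J_3');
xlabel('x'); ylabel('J_\alpha(x)');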

Bessel functions of the second kind: $Y_\alpha$

The Bessel functions of the second kind, denoted by $Y_\alpha(x)$, are solutions of the Bessel differential equation that have a singularity at the origin.

For non-integer $\alpha$, $Y_\alpha$ is related to $J_\alpha$ by:

$$ Y_\alpha(x) = \frac{J_\alpha(x) \cos(\alpha\pi) - J_{-\alpha}(x)}{\sin(\alpha\pi)} $$

In the case of integer order $n$, the function is defined by taking the limit as a non-integer $\alpha$ tends to $n$:

$$ Y_n(x) = \lim_{\alpha \to n} Y_\alpha(x). $$

There is also a corresponding integral formula (for $\operatorname{Re}(x) > 0$):

$$ Y_n(x) =\frac{1}{\pi} \int_0^\pi \sin(x \sin\theta - n\theta) \, d\theta - \frac{1}{\pi} \int_0^\infty \left[ e^{n t} + (-1)^n e^{-n t} \right] e^{-x \sinh t} \, dt.$$

$Y_\alpha(x)$ is necessary as the second linearly independent solution of Bessel’s equation when $\alpha$ is an integer. But $Y_\alpha(x)$ can also be considered a ‘natural’ partner of $J_\alpha(x)$.

When $\alpha$ is an integer, moreover, as was similarly the case for the functions of the first kind, the following relationship is valid:

$$ Y_{-n}(x) = (-1)^n Y_n(x).\,$$

Both $J_\alpha(x)$ and $Y_\alpha(x)$ are holomorphic functions of $x$ on the complex plane cut along the negative real axis. When $\alpha$ is an integer, the Bessel functions $J$ are entire functions of $x$. If $x$ is held fixed, then the Bessel functions are entire functions of $\alpha$.

Bessel Filters

In electronics and signal processing, a Bessel filter is a type of linear filter with a maximally flat group delay. The name comes from the fact that the denominator of the filter’s transfer function is a reverse Bessel polynomial. Bessel filters are often used in audio crossover systems. Analog Bessel filters are characterized by an almost constant group delay across the entire passband, thus preserving the wave shape of filtered signals in the passband.

A low-pass active filter with a Bessel response is used when the filter needs to exhibit minimum differential delay between the various frequency components of interest contained within the input signal being filtered. In essence this means that the fundamental frequency of, say, an applied square wave experiences the same input-to-output delay as the other harmonics within the filter’s passband. This results in a high degree of fidelity of the output signal relative to the input signal.
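As a sketch of what this looks like in MATLAB, besself (Signal Processing Toolbox) designs the analog prototype, and the group delay can be estimated from the phase response; the order and cutoff below are arbitrary example values.

% Sketch: analog Bessel low-pass design and its group delay
n  = 5;                       % filter order
wc = 2*pi*10e3;               % ~10 kHz cutoff, rad/s
[b, a] = besself(n, wc);      % analog Bessel low-pass prototype
[h, w] = freqs(b, a, 1024);   % analog frequency response
grpdel = -diff(unwrap(angle(h)))./diff(w);   % approximate group delay (s)
semilogx(w(2:end)/(2*pi), grpdel);
xlabel('Frequency (Hz)'); ylabel('Group delay (s)');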


Excel Sorting and Grouping

I had two tables downloaded from Amazon:

Items

Order Date  Order ID    Title   Category
1/26/14 102-4214073-2201835     Everyday Paleo Family Cookbook
1/13/14 115-8766132-0234619     Awesome Book A
1/13/14 115-8766132-0234619     Awesome Book B

and

Orders

Order Date  Order ID    Subtotal
1/6/14  102-6956821-1091413 $43.20
1/13/14 115-8766130-0234619 $19.42
1/16/14 109-8688911-2954602 $25.86

I’m building our Q1 2014 taxes and needed rows in the following format:

1/13/14 115-8766132-0234619 $22.43 Awesome Book A, Awesome Book B

In order to do this without using SQL, I did the following. If column B corresponds to Order ID and column C corresponds to the item Title, then I put the following formula in cell N3:

=+IF(B3=B2,N2 & " | " &C3,C3)

and in cell O3 (in a column that might be named “last before change?”):

=+IF(B3=B4,"", TRUE)

Then I could easily sort out the unwanted values. Done. Still, I would like to better automate this. Any thoughts appreciated.
