Hi Eric Some years ago I reviewed some developments on time triggered
architectures for anti-lock braking. Here each wheel acted independently. I did
some calculations on taking the bus time interval and how far a car would
travel on a high speed bend on the autobahn at 160km/h and it was around 1.5m.
I then considered what would happen on black ice if the brakes were applied to
one or opposite wheels by this time interval and decided that I would not want
to be in that car as this would probably start an uncontrollable spin. This is one example of the difference between the continuous
domain and the discrete domain. Default actions can be quite tricky to define. When I worked at
Motorola, we had the "Motocar". This was a Lancia Delta wher we took
out all the wiring (1.5km) and put a power bus and a fibre optic control bus
round the car. There were then about 30 black boxes e.g. one at each light cluster.
The question of default actions e.g. when communications failed became
interesting. Lights are easy - you put them on, as it is no real problem to
have lights on during the day, but hard and illegal to drive without lights at
night. Same with the fuel pump. The biggest question is what to do with the
windscreen wipers - default on or off? By the way, it was fun driving such a car in failure mode, as
there were no turn signals, and the UK highway code never gave hand signals for
a left hand drive car turning right and unable to open the electric windows! Tony Gore email tony@xxxxxxxxxxxx From: Eric Verhulst [mailto:eric.verhulst@xxxxxxxxxxxxxxxxxxxxxx]
Hi Ian, I might have a text (or at least the basis) for your last book. It
is an email I send to the Formal Community after FM2009, picking up a leaflet
on a survey of formal methods. I also did some investigation in car safety, and the issues are
much wider than adding the right IF statements. The root cause is incomplete requirements
analysis to start with, but mainly that the automotive people have not yet
understood the difference between the continuous and the discrete domain.
Mechanical parts are in the first domain and have a natural “graceful
degradation” property. Electronics and software are in the second domain
and can stop working in one clock cycle (say 10 nanoseconds). Even with a very
low probability of failure at the hardware level, after some time in the human
domain (e.g. driving a car for 5 years), it is a certainty that it will happen.
Even using a dual-core controller (like TI has one with ARMs) will not cover
the common mode failures. Hence our controller will be heterogeneous. My conclusions is that subsystems like traction control,
braking, etc need to be SIL4 => fault tolerance through triplication (pedal,
sensors, CPU, SW, …) and voting. If this fails your are at SIL3 (fail
safe mode), which is now the “standard” practice => often it
means they put the engine at high throttle. Still very risky at high speed on a
highway. Humans need a bit more than 20 nanosecs to take corrective action. Best regards, Eric See below the mail I sent (to John Fitzgerald). Dear FM authors, I picked up the leaflet on the FM report at
FM2009, read your excellent report, and while I contributed to the
survey, I believe that the kind of experience we have with using FM is a bit
missing from the report. In 2005 we started a project (with support
from the Flemish IWT Gov organization) to develop a distributed RTOS (OpenComRTOS)
from scratch but using FM from the beginning. (this was under the banner of
Open License Society). The purpose twofold: 1.
To learn to know the practical
use of Formal Methods. (we had no prior experience, just awareness, mostly of
CSP). 2.
To develop a “correct”
(read error-free and clean) networked RTOS. There was prior art. Virtuoso was a
similar RTOS, developed by Eonic Systems in the 90’s and sold to Wind
River in 2001. The team was composed of three people: -
Myself, mainly as project
manager and RTOS architect. -
Gjalt de Jong, PhD in
electronics but well versed in telecom and software implementor. -
Jose Faria, an engineer,
electro-mechanical engineer: he was our formal methods engineer (no prior
experience). The FM work was supervised by Prof. Raymond
Boute. The first 6 months were mostly spend
getting familiar in a rather informal way with what FM were, looking at tools,
reading some books and papers, and finally selecting a tool. We originally had planned to use SPIN (as
Promela is C-like), but Prof. Boute suggested to use TLA+/TLC. While we were
originally less enthousiastic about the mathematical notation, in the end
this might have been beneficial because it raised the level of abstraction,
expecially in the way of thinking about the problem. The way of working was that only Jose
worked full time (although he left after one year). Gjalt and myself worked in
parallel at a company managing embedded software development but with freedom
to work on our RTOS project when we wanted. The way of working was mainly
by email (Jose was at another location) and by meeting face-to-face every 2 to
4 weeks. Mainly discussing using a white board. So what happened? First of all, the original TLA models were
useless. It turned out that we were still thinking in terms of (C)
implementation. E.g. when a waiting list was to be modeled, Jose initially
wrote TLA code mimicking a linked list with pointers, etc. Turned out that the
models then exploded very fast and became rather complex. Until we realized
this and simple used a set. Another example was the discovery that waiting
lists and buffers were actually the same thing, etc. So we started again and then quickly as
well incrementally build up the models. Not of the full RTOS (that is beyond
the capabilities) but of the core functionality, like scheduling, semaphores,
events FIFOs, priority inheritance, etc. Initial models were just 2 pages, but
after some 25 iterations they could grow to a 20- 30 pages. However, between
each iteration, often only 30 minutes was needed. We ran the model until no
errors were found and then refined it, ran it again, etc. Running the models
was in minutes. I must add that the interactions of the three people involved
were very stimulating and helped a lot because each of use looked at the
problem from a different angle. At some point, we discovered that most
traditional RTOS services are actually variations on a theme. Basically each
service has a synchronisation point, a synchronization predicate and a
synchronization action. We called this generic entity a “hub”.
Another novelty was that we had consistently designed the RTOS as a
packet switching system (in a very systematic way). This process lasted about 12 months. Two
weeks later we had running code. This code was derived by hand from the models
that served as architectural design. Since, we have found errors, but most of
them were related to the underlying “HAL”, either an OS like
Windows of Linux, either a specific embedded processor. Noteworthy is that
doing this we discovered two serious “bugs” in Windows and Linux.
In Windows the “CriticalSection” system call crashed when stressed,
so did the Linux semaphore. The major surprise however was the code
size. Compared with Virtuoso we have 5 to 10 times less code for more or less
the same basic functionality. This was even more surprising as Virtuoso had a
very good reputation for size and performance, it was hand tweaked, assembler
optimized, 3 generations of the RTOS were developed and about 150 person-years
went into it. It was also very hard to port. OpenComRTOS on the other was
almost fully written in ANSI-C (Misra checked), easily ported and much more
scalable as well as safer. Code size figures: typically 5 to 10 Kbytes for a
full version (MP). An optimized SP version has been compiled to less than
1 KB. The Virtuoso RTOS typically started at 5 Kbytes (just task scheduling)
and easily went up to 70 – 80 Kbytes (typically on a 32bit machine). Once the software was stable, new Formal
models in TLA were developed from the implementation source code. This was
quite straightforward. Lessons: 1.
FM methods are not that
difficult, but they must be used as a supporting tool from the beginning when
developing the architecture and even when narrowing down the specifications. 2.
One doesn’t need large
teams and efforts to use them, although dedication is a must. 3.
The biggest gain is in the
architecture. 4.
Once a software has been
developed with FM, verification (using model checking) is easy. What we did not do, is to verify the C code
using static checkers or proofing tools. This is probably more appropriate for
numerical code. This brings me to a final conclusion. The
current dominant attitude in the FM community is still that FM are to be used
to verify software. The problem is that this can still be badly designed and
inefficient software. Hence the verification effort will also be much larger
than needed. FM methods really shine when used as a tool and method supporting
abstract thinking, together with informal techniques, pseudo-formal techniques
(like architectural diagrams), etc. I recently attended a presentation by MS on
the Hypervisor verification (60 000 lines of C ). I was astonished to hear that
each C function is more or less manually proofed with as only input for the
formal engineer the C code! I presume that if this Hypervisor had been
developed using FM from the start, it could have been a lot smaller and likely
a lot cleaner and more performant. Just image that all embedded software had
been developed with FM resulting in code sizes that are maybe 20% of what we
have now. The world would have been different. It would also have meant a
serious reduction in power consumption. It is also my opinion that with a well
trained team and using such “formalized” methodology, projects
don’t need to take more effort that the still dominant manual methods.
Hence, it can really a cost reduction and certainly a serious risk reducing
approach (the traditional methodology is very prone to run-away after the
project was officially finished). Best regards, Eric Verhulst PS 1.
In the meantime we have started
to work on the time-out functionality. Here using TLA is problematic, so we
switched to UPPAAL. We are also starting to use Frama-C and B. 2.
A paper was published at FM
2008 on this project. From: Mailing_List_Robot
[mailto:sympa@xxxxxxxxxx] On Behalf Of Ian East Hi Tony Thanks for this. Very worrying, given
I drove my family around Florida recently in a Prius! It's interesting how simple basics can get
drowned out. I suppose it happened to NASA with Challenger. One of the things that attracts me to rail,
over road, (forthcoming book The Cost of the Car) is the engineering
heritage. Safety engineering began with rail, and signalling makes a nice
student exercise in sequential system design, and, more recently, in formal
analysis. I worry whether the same art is being
applied to car design. Of course, it is entirely absent in road design,
which is inherently lethal anyway, as the annual carnage shows. And, of course, it is absent in the
training of most 'software engineers', as is engineering per se. I too began with µprocessor programming in
assembly language, and believe as much has been lost as gained in the use of
'standard' HLLs. I have had unending difficulty adapting to universal
hacking – the lack of any real specification, design, or analysis, with
software. I'm trying to assemble a series of texts
called Electronics, Systems, and Programming (http://www.openchannelpublishing.com/esp.html) that
combine a more formal approach to software, and would welcome a proposal,
particularly wrt embedded concurrent/reactive systems. (You get £2.50
back a book, instead of 80p at most, and a lower selling price, hence more
sales. And yes, that is realistic, especially with e-books on the rise.) The CPA community is still badly in need of
books that reflect its idea and ideals. The right people are always too
busy, I guess. Best Ian On 31 Mar 2010, at 03:04, Tony Gore wrote: Hi Ian See this from an engineer who used to work at Ford in around the same era
that I was working in automotive. The key phrase he uses is Although not common over here in the UK because of our penchant
for stick shift (manual gearboxes), most cars in the US have automatic
gearboxes and cruise control. One of the standard signals to cruise control is
the brake signal to disengage it. Tony Gore email tony@xxxxxxxxxxxx Ian East Open Channel Publishing Ltd. (Reg. in England, Company Number 6818450) |