Denis,
Regarding bit-exactness:

3. In most systems, the real load is carried by the floating-point
units. The IEEE 754 standard is important here for several
reasons.
- Floating-point arithmetic is famously not associative. This
heavily restricts the optimisations that can be performed while
retaining bit-for-bit identical results. You either accept that
the answers can change, write your code very carefully to
pre-implement the optimisations, or go slow.
Non-associativity is a problem. It's why these two pieces of code can give different results for the value sum:

    sum = 0
    for i = 0 to x.size - 1
        sum = sum + x[i]

    sumFirstPart = 0
    for i = 0 to x.size/2 - 1
        sumFirstPart = sumFirstPart + x[i]
    sumSecondPart = 0
    for i = x.size/2 to x.size - 1
        sumSecondPart = sumSecondPart + x[i]
    sum = sumFirstPart + sumSecondPart
The second code segment is exactly what you want in order to perform the two partial sums in parallel. I believe that in some real-world systems the decisions about parallelisation are made at run time, depending on what the computer is doing at the moment, and so different runs on identical data can give rise to different results.
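To make this concrete, here is a minimal Python sketch (my own example, not taken from any particular system) showing that the sequential sum and the two-partial-sums version really do disagree on the same data:

```python
# Ten copies of 0.1: each addition incurs a rounding error, and the
# accumulated error depends on the order of the additions.
x = [0.1] * 10

# Sequential left-to-right sum, as the first code segment does.
seq = 0.0
for v in x:
    seq = seq + v

# Two partial sums combined, as the second (parallelisable) segment does.
half = len(x) // 2
par = sum(x[:half]) + sum(x[half:])

print(seq)         # 0.9999999999999999
print(par)         # 1.0
print(seq == par)  # False: same data, different order, different bits
```

The discrepancy is only one unit in the last place here, but in a long reduction such differences compound, which is why reordered parallel reductions cannot in general be bit-exact.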
4. Getting bit-for-bit matching answers from consecutive runs is really difficult. Obviously, we need to seed our PRNGs consistently, but all sorts of internal code and optimisation issues can break things, and this makes verifying benchmark tests genuinely hard.

Overlaid on this are sometimes genuine instabilities in important scientific codes. For example, global ocean models can be very hard to "spin up": you need exactly the right initial conditions or the circulation never converges to that of the planet we know. This may not even be a defect in the models; perhaps the real equations of motion have multiple fixed points. There are similar difficulties in molecular dynamics around hydrogen bonding. Sadly, that is the case we care about most, since it covers protein folding in hydrated systems.
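On the PRNG point, the seeding discipline is at least easy to state in code. A sketch using Python's standard random module (the workload here is a made-up stand-in): give each run an explicitly seeded private generator rather than relying on global state, so the seed is the only thing you need to record to reproduce the stream:

```python
import random

def simulate(seed):
    # A private, explicitly seeded generator: the same seed always
    # yields the same pseudo-random stream, run after run.
    rng = random.Random(seed)
    # Stand-in workload: sum of 1000 pseudo-random draws.
    return sum(rng.random() for _ in range(1000))

# Identical seeds give identical results across consecutive runs...
assert simulate(42) == simulate(42)
# ...but, as above, this only holds if the summation order and the
# rest of the code path are also identical between runs.
```

Of course, as the preceding point shows, consistent seeding is necessary but not sufficient: any reordering of the floating-point work downstream of the generator can still change the bits.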
The non-associativity of f.p. arithmetic is the cause of many problems. Is the repeatability problem you mention due to effects other than this?
Roger