Archive for the ‘First Fault Problem Solving’ Category

Review of First Fault Software Problem Solving Book

Sunday, May 2nd, 2010

c’t – Magazin für Computertechnik has published a review of the First Fault Software Problem Solving book:

http://www.heise.de/ct/inhalt/2010/08/192/ (in German)

Fabian Röken kindly translated it into English:

No large software package comes without errors. Customers seem to simply accept this, patiently waiting and hoping for patches or updates. Skwire argues for a more targeted approach: software will never be faultless, but it would be a great improvement if flaws were resolved on their first occurrence (“first fault”) rather than only after a long analysis (“second fault”).

The advantages are obvious. However, a correspondingly rigorous system architecture, as is common on mainframe platforms such as IBM’s z/OS, never became prevalent in the PC market.

Skwire outlines the types of errors and the strategies for resolving them in full detail. His 40 years of experience, including time at IBM, shine through again and again. He takes care that the reader understands the terminology he is using: “What is a problem in the first place?”, “What is a service point?” In some cases he also explains specific metrics such as the “serviceability rating”.

His tool classification includes instructive tips, e.g. on how to structure an error log, or on tracking how often an error must occur before a solution has to be pursued. His suggestions are addressed equally to developers, designers, testers, managers, and end users. In the last chapter he presents and reviews commercial tools for first-fault and second-fault environments.

Skwire addresses a topic that is unfortunately much neglected, and this alone makes his book worth a look (***). Short quotations and humorous drawings lighten the technical subject matter. If you are looking for an overview, this book will serve you well. However, if you are a software developer looking for source code samples, you will search in vain. Skwire has published the book via print on demand. You can find it on Amazon, for example.

(Tobias Engler/fm)

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

FFSPS Book is No. 1 Microsoft OS English Book Bestseller in Germany

Friday, April 23rd, 2010

Source: Amazon DE (at the time of this writing)

 

It is also the top bestseller among OpenTask titles: http://www.opentask.com/top-bestseller-april-2010

- Dmitry Vostokov @ DumpAnalysis.org + TraceAnalysis.org -

First Fault Software Problem Solving Book

Wednesday, December 9th, 2009

I’m very pleased to announce that Dan Skwire’s unique book has been published by OpenTask:

First Fault Software Problem Solving: A Guide for Engineers, Managers and Users

 

- Dmitry Vostokov @ DumpAnalysis.org -

Forthcoming Books in Q4, 2009

Thursday, September 17th, 2009

I plan to publish the following titles in Q4:

- Debugged! MZ/PE: Software Tracing, September, 2009 (ISBN: 978-1906717797)
- Windows Debugging Notebook: Essential Concepts, WinDbg Commands and Tools (ISBN: 978-1906717001)
- Memory Dump Analysis Anthology, Volume 3 (ISBN: 978-1906717438 and 978-1906717445)
- Memory Dump Analysis Anthology: Color Supplement for Volumes 1-3 (ISBN: 978-1906717698)
- First Fault Software Problem Solving: A Guide for Engineers, Managers and Users (ISBN: 978-1906717421) by Dan Skwire
- Crash Dump Analysis for System Administrators and Support Engineers (Windows Edition) (ISBN: 978-1906717025)

The title of the last book was changed slightly: after some time we realized that the same material is appropriate for support engineers as well.

- Dmitry Vostokov @ DumpAnalysis.org -

The Importance of First Fault

Thursday, November 27th, 2008

I’ve been thinking about so-called first faults since Dan Skwire, a veteran in mission-critical computer system problem resolution, problem prevention, and system recovery, organized a LinkedIn group for first fault problem solving. He also has a website:

http://www.firstfaultproblemresolution.com/ 

In my software technical support experience, first fault problem resolution is very important on Windows platforms, especially in enterprise terminal services and virtualized environments, where hundreds of users can be hosted on a single server. Proper tools, processes and checklists therefore need to be established for effective and efficient troubleshooting and problem resolution, from both the engineering and the customer relationship management perspectives. Here crash and hang dump analysis helps immensely, especially memory analysis patterns and fault databases. More on this later with specific examples. I’m also currently working on incorporating first fault problem resolution into VERSION troubleshooting steps and PARTS troubleshooting methodology.
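
To make the idea concrete, here is a minimal Python sketch of capturing diagnostic context at the moment a fault first occurs, so that the problem can be analyzed without waiting for it to happen again. It is only an illustration under my own assumptions: the names `FAULT_LOG`, `capture_fault_context` and `parse_port` are hypothetical, and this is not a technique taken from Dan Skwire’s material or from any particular tool.

```python
import functools
import traceback
from datetime import datetime, timezone

# Hypothetical in-memory fault store; a real system would persist this.
FAULT_LOG = []


def capture_fault_context(func):
    """Record diagnostic context when func fails, then re-raise the error."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Save what a support engineer would otherwise have to ask for.
            FAULT_LOG.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "function": func.__name__,
                "args": repr(args),
                "kwargs": repr(kwargs),
                "error_type": type(exc).__name__,
                "message": str(exc),
                "traceback": traceback.format_exc(),
            })
            raise  # the caller still sees the original failure
    return wrapper


@capture_fault_context
def parse_port(text):
    # Hypothetical example function that fails on malformed input.
    return int(text)


try:
    parse_port("80a")
except ValueError:
    pass

print(FAULT_LOG[0]["error_type"], FAULT_LOG[0]["function"])
```

The record is written on the first failure and the exception is re-raised unchanged, so normal error handling is not affected. On Windows, a crash or hang dump plays the same role at process or system level.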

- Dmitry Vostokov @ DumpAnalysis.org -