
COCOMO - appropriate for languages like Python

From: Mike Brenner <mik...@mitre.org>
Fri, 12 Jul 2002 14:31:55 -0400
Mike > ... is it possibly a bug in Internet Explorer or comp.lang.python? ...

Andreae > ... None of the above ... see what your post looks like ... google ...

http://groups.google.com/groups?dq=&hl=en&lr=&ie=UTF-8&safe=off&selm=mailman.1026318496.22583.python-list%40python.org

Thank you very much for this reference. I reported this terrible display bug to GOOGLE. Here is the post with hard carriage returns.

--------------------------------

[REPOSTED WITH HARD CARRIAGE RETURNS]


> From a management standpoint ... strive to hire the very best people
> and to break projects into moderate sized, easily handled chunks.
> From a schedule estimation standpoint, LOC
> (however you choose to count them) appears to be a pretty good
> estimator to use for a fixed staff and fixed project sizes.
> How the prediction varies as you change staff or project sizes is
> something you'll have to measure or guess at yourself.


IMO, the following measurements apply to most Python efforts 
more than lines of code:

        - Most software has achieved the status of "maintenance" 
rather than "development". Thus, millions of lines of code might 
require, say, a one-line change. Some of those one-line changes 
take a month of research and testing, while others take a few seconds. 
The lines of code changed (and the lines of code in the whole project) 
only correlate to new code, not to code under maintenance.

        - Software Maintenance time primarily increases as backlog 
increases (e.g. programmer overtime which the company intends to not 
pay, other work awaiting action, inadequate functional testing, 
deferred regression testing which detects reappearance of prior bugs, 
incomplete impact analysis of past and present changes, 
incorrect documentation, and the age of parallel development
paths). For example, parallel development paths that last more 
than a few days (fractured baselines) start to take on lives 
of their own, and become more expensive to merge into a single 
"golden" baseline as they age.  

        - Software Maintenance time decreases as the technology 
factor (the part of the COCOMO model that applies to software 
maintenance rather than to new development) increases. Thus, 
to save time, get better tools for: comparison, visualization,
cross-referencing, text pattern search, automatic testing, slicing, 
and other tools to assist in determining the impact of changes. 
Also, use an interpreted language like Python with good debugging, 
tracing, and other tools. Measure the number of seconds it takes 
to find the line of code that caused a problem (something that 
gets harder when you add wxWindows, threads, servlets, graphics, 
COM objects, operating system calls, web services, SOAP,
and additional levels of interpretation to the software). 
        
        - Software Maintenance time increases with the number 
of data integrity problems possible in the languages and tools 
used (coding bugs, data design flaws, data flaws, network design 
flaws, rifts, and sabotage). Coding bugs include off-by-one errors, 
aliasing, global variables, hanging pointers, out-of-range errors, 
missing actions, extra actions, global typing violations (even in 
interpreted languages), and memory leakage. Data design flaws include
failure to trigger an action (e.g. busting a referential 
integrity constraint), implementing a many-to-many relationship 
inefficiently, mismatching keys, inappropriate level of normalization,
performing a wrong action, or incomplete design. Data flaws include 
conversion round-offs or other loss of precision, incorrect origin 
or units of measure, inaccurate numbers, inappropriate null 
data, obsolete data, out of range errors, two-digit years, and 
local type mismatches. Network design flaws include bandwidth 
too low (errors or noise), bottlenecks, inadequate error detection,
notification, correction, speed mismatch, missing packets, 
power failures, race conditions such as deadlocks, 
insufficient redundancy, loss of resolution, simultaneous modifications, 
and slow response time. 
Rifts are another name for the backlogs above, which were called 
out separately because of their large effect on time. Sabotage 
includes tampering with the data, trojan horses, viruses, authentication 
where none is needed, weak authentication where authentication 
is actually needed, worms, failure of CM to preserve the parts, 
management destroying service requests which were completed or 
cancelling them to remove the evidence, and lack of backups. 

        - Software Maintenance time positively correlates with 
the amount of time since the maintenance organization last maintained 
that type of code (after subtracting the time it takes to open up 
the configuration, which relates to how long it has been since a 
project last opened that code).

        - Software Maintenance time increases with manually enforced 
syntactic standards. For example, a rule like "indent each level 
3 spaces" slows down software maintenance and development in 
two ways. First, humans (like programmers, reviewers, and 
quality assurance auditors) will tend to spend time enforcing and 
carrying out such a rule, because of their ego-needs for 
control; people should only impose such a rule by providing 
a tool to enforce the rule automatically. Second, such rules 
limit the visualization possibilities -- for example, someone 
might discover a bug more quickly by viewing the code 
with several different indentations.

        - When management decides to use a metric, and the 
programmers become aware of that metric, then the programmers 
take whatever action is required to make that metric reach 100%. 
For example, if management pays the programmers by the line 
of code, the lines of code will increase.

        - When management (personal, government, corporate, 
academic) requests non-applicable metrics like lines of code 
(a metric which makes sense only at development time) or 
McCabe Cyclomatic Complexity (a metric which makes sense only 
at testing time since it counts test paths and dings programmers 
for good stuff like nested SWITCH/CASE statements and BREAKs) 
to describe maintenance effort, consider getting rid of that 
entire level of management.

        - Without an accepted standard definition of 
"lines of code", one cannot know whether to count every line 
in every IMPORTed module, or just the INVOKED lines, 
or just those lines that do the invoking. For example, do 
we count every line in the Python runtime module STRING.PY or 
just the lines in STRING.PY that our modules call, or just 
the lines in our module that call STRING.PY? These different 
counting strategies differ by orders of magnitude. 
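
As a rough illustration of that last point, here is a minimal sketch that
applies the three counting strategies to the standard library's string.py.
Everything except the string module itself is an assumption made for
illustration: OUR_SOURCE stands in for a hypothetical module of ours, and
"a line" is taken to mean a physical line of source text, which is only
one of several possible definitions. It also assumes the Python
installation ships string.py as readable source.

```python
import inspect
import string

# Hypothetical "our module": its only use of string.py is one call.
OUR_SOURCE = '''import string

def title_case(text):
    return string.capwords(text)
'''


def count_physical_lines(source):
    """Count every physical line, including blanks, comments and docstrings."""
    return len(source.splitlines())


def count_invoking_lines(source, module_name):
    """Count non-comment lines in our code that reference module_name.<attr>."""
    prefix = module_name + "."
    return sum(
        1
        for line in source.splitlines()
        if prefix in line and not line.lstrip().startswith("#")
    )


def loc_by_strategy():
    """Return the line count produced by each of the three strategies."""
    return {
        # Strategy 1: every line in the imported module.
        "whole_module": count_physical_lines(inspect.getsource(string)),
        # Strategy 2: only the lines of the function we actually invoke.
        "invoked_only": count_physical_lines(inspect.getsource(string.capwords)),
        # Strategy 3: only the lines in our module that do the invoking.
        "invoking_only": count_invoking_lines(OUR_SOURCE, "string"),
    }


if __name__ == "__main__":
    for strategy, lines in loc_by_strategy().items():
        print(f"{strategy}: {lines}")
```

The exact numbers depend on the Python version installed, but the three
counts should come out far apart for the same one-line use of the library.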

Mike Brenner
