Talk:Applied statistics

From Citizendium
Revision as of 12:04, 16 January 2010 by Boris Tsirelson (Tutorials: The prosecutor's fallacy: more)
This article is developing and not approved.
Definition: The practice of collecting and interpreting numerical observations for the purpose of generating information.
Workgroup categories: Engineering, Economics and Mathematics
Talk archive: none. English language variant: British English

Help welcomed

My personal experience of this subject does not extend beyond economic statistics and quality control, so I should welcome inputs that put me right on other aspects. Also, I have had no mathematical training beyond the basic necessities of a professional engineer, so a mathematical input for the same purpose would also be welcome. By way of explanation of my drafting, I have in mind a readership of graduates and undergraduates who are neither statisticians nor mathematicians but may want to use statistics in their work or leisure. In view of the magnitude of the subject, I am trying to touch briefly upon most aspects, at just enough length to convey its flavour and provide links to authoritative references. I think that my cautionary notes in the text are justified by the damage done by the misuse of statistics, especially by my mathematical fellow-economists and their dramatic misapplication of statistics to financial risk assessment. Nick Gardner 14:46, 28 June 2009 (UTC)

My thanks for help with definitions, and I look forward to further initiatives from mathematicians in creating further definitions. I would point out, however, that a definition that contains undefined terms is unhelpful. The current definition of "sample" is a case in point. In my opinion it cannot be allowed to stand. Nick Gardner 06:00, 30 June 2009 (UTC)

I should like to add a note to the glossary title on the Related Articles subpage to the effect that for mathematically precise explanations of the concepts, the reader should refer to the Statistics theory article. This does not seem helpful in the present state of that article, but perhaps I should do so in anticipation of its further development? Nick Gardner 10:53, 30 June 2009 (UTC)

Using "Related Articles" as a glossary works quite well for some terms, but not for all. There are some terms which will never get a page under this title, e.g. Mean (which needs disambiguation), or others for which the explanation is too long or only suitable in a certain context.
(And a definition probably should not contain a displayed formula?) Peter Schmitt 12:45, 1 July 2009 (UTC)
There are lots of other glossary definitions that will never have their own pages. Is there any harm in that? Point taken about the formula. I'll delete it. Nick Gardner 17:20, 1 July 2009 (UTC)

Category

I don't understand the choice "Library and information science". Shouldn't it be Mathematics, Sociology, Health Sciences, Geography, ...? Peter Schmitt 12:54, 1 July 2009 (UTC)

I didn't know where to put it. There seems to be no obvious place for decision theory or information technology, which I am inclined to think are near neighbours. I did not want, for reasons that I have explained, to give the impression that it should be treated as a branch of mathematics. It is true that it uses a lot of mathematical theorems, but so does engineering, so perhaps I should put it there? I think I will add it there, but I am open to suggestions. Nick Gardner 17:10, 1 July 2009 (UTC)

Definitions / Glossary

Nick, your extended definition

  • Confidence interval [r]: the range of a random variable, such as the mean of a sample, that — with a specified probability — contains the true value for the population. [e]

shows the problem of combining it with a glossary: It certainly is suitable in the context of the article. However, as a definition it has to stand independently of the article — and then "population" is not correct. Or would you use "population" when you calculate the confidence interval of the measurements of a distance? Peter Schmitt 20:22, 1 July 2009 (UTC)

Thanks. I don't see why not, but please make whatever addition or qualification that you consider appropriate. Nick Gardner 06:24, 2 July 2009 (UTC)

Another point: The guidelines (which appear on new Definition pages) ask to start with a capital letter. (I probably would also use lower case. However, I understand that it is necessary to keep a uniform appearance.) Peter Schmitt 20:29, 1 July 2009 (UTC)

Oh dear! You have spotted a rule that I have broken several hundred times in the course of the last year! I think I will leave it to someone else to go back over my work and put it all right. I might be running out of time! Nick Gardner 06:24, 2 July 2009 (UTC)
Thanks for your comments, Peter. You have convinced me that the definitions were more trouble than they were worth, and I have deleted them, as far as I can. I hope someone can tidy the matter up by deleting them fully or putting them right. Nick Gardner 16:05, 2 July 2009 (UTC)

Ready?

This article is now complete to the extent that I have been able to make it so (although I have no doubt that it could be augmented by contributions from professional statisticians). Nick Gardner 11:51, 15 January 2010 (UTC)

Tutorials: The prosecutor's fallacy

I think it is less simple; several scenarios should be treated separately.

Scenario A: the investigator has, or is able to obtain, the DNA of all half a million people. Then indeed it may happen that he just took the first matching one, and then indeed the probability of error is close to 1.

Scenario B: the investigator is able to check only one man (well, maybe two or three) for DNA. Here we have two sub-scenarios.

B1: The investigator, when seeing a man, is able to guess before the test whether his DNA matches or not. (Quite a fantastic assumption, isn't it?) Then maybe he, seeing a lot of men, chooses one that should fit, gets a positive test result, and indeed it is a false positive with probability close to 1 (as in scenario A).
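
As a rough illustration of why the error probability in scenarios A and B1 comes out close to 1, here is a minimal Python sketch. The population of half a million is taken from the discussion above; the random match probability of 1 in 10,000 is purely a hypothetical figure chosen for illustration, as is the function name. The sketch also assumes exactly one culprit, who always matches, and it replaces the random number of innocent matches by its expected value, so it is an approximation rather than an exact calculation.

```python
def prob_first_match_is_innocent(population: int, match_prob: float) -> float:
    """Approximate probability that a person picked at random among all
    DNA matches is innocent.

    Assumptions (not from the article): exactly one culprit, who always
    matches; every innocent person matches independently with probability
    match_prob; the number of innocent matches is replaced by its expectation.
    """
    expected_innocent_matches = (population - 1) * match_prob
    # One true match (the culprit) plus the expected innocent matches.
    return expected_innocent_matches / (expected_innocent_matches + 1)


population = 500_000       # "half a million people", as in scenario A
match_prob = 1 / 10_000    # hypothetical random match probability

p_error = prob_first_match_is_innocent(population, match_prob)
print(f"Approximate probability that the match is a false positive: {p_error:.3f}")
```

With these assumed figures there are about 50 innocent matches for one guilty match, so the result is roughly 0.98. A smaller match probability lowers the figure, which is why the conclusion depends on the scenario and on the numbers assumed.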

(To be continued soon.) Boris Tsirelson 17:58, 16 January 2010 (UTC)