Computational Statistics in Data Science
Authors: various authors
ISBN: 9781119561088
Genre: Mathematics
Publisher: John Wiley & Sons Limited
2 Statistical Software
Alfred G. Schissler and Alexander D. Knudson
The University of Nevada, Reno, NV, USA
This chapter discusses selected statistical software in a format intended to inform users transitioning from basic applications to more advanced ones, including elaborate statistical modeling and machine learning (ML), simulation design, and big data situations. We begin by discussing the most popular statistical software. For each language, we provide some historical context for the computing environment, describe the foundational principles behind its development (its purpose), discuss user environments and workflows, and analyze its strengths and shortcomings relative to other popular or notable statistical software, along with language support and other software features.
Next, we briefly mention an array of software used for statistical applications. We discuss the specific purpose of each tool and how it fills a need for data scientists. The aim here is to be fairly complete, providing a comprehensive view of the statistical software ecosystem and leaving readers with some familiarity with the most prevalent languages and software.
After the presentation of noteworthy software, we transition to describing a handful of emerging and promising statistical computing technologies. Our goal in these sections is to guide users who wish to be early adopters of a software application, as well as readers facing a scale‐limiting aspect of their current statistical programming language. Some of the latest tools for big data statistical applications are discussed in these sections.
To orientate the reader to the discussion below, two tables are provided. Table 1 lists the software described in the chapter. Throughout, we discuss user environments and workflow considerations to provide practical guidance, aiming to increase efficiency and describe typical use cases. Table 2 summarizes the environments covered in the sections that follow.
1 User Development Environments
We begin by discussing user environments rather than focusing on specific statistical programming languages. The subsections below contain descriptions of some selected user development environments and related tools. This introductory material may be omitted if desired, and one can safely proceed to Section 2 for descriptions of the most popular statistical software.
Table 1 Summary of selected statistical software.
Software | Open source | Classification | Style | Notes |
---|---|---|---|---|
Python | Y | Popular | Programming | Versatile, popular |
R | Y | Popular | Programming | Academia/Industry, active community |
SAS | N | Popular | Programming | Strong historical following |
SPSS | N | Popular | GUI: menu, dialogs | Popular in scholarly work |
C++ | Y | Notable | Programming | Fast, low‐level |
Excel | N | Notable | GUI: menu, dialogs | Simple, works well for rectangular data |
GNU Octave | Y | Notable | Mixed | Open source counterpart to MATLAB |
Java | Y | Notable | Programming | Cross‐platform, portable |
JavaScript, TypeScript | Y | Notable | Programming | Popular, cross‐platform |
Maple | N | Notable | Mixed | Academia, algebraic manipulation |
MATLAB | N | Notable | Mixed | Speedy, popular among engineers |
Minitab | N | Notable | GUI: menu, dialogs | Suitable for teaching and simple analysis |
SQL | Y | Notable | Programming | Necessary tool for databases |
Stata | N | Notable | GUI: menu, dialogs |