Скачать книгу

(as you might expect) represents a danger of one sort or another.

      

Whenever you see this icon, think advanced tip or technique. You might find these tidbits of useful information just too boring for words, or they could contain the solution you need to get a program running. Skip these bits of information whenever you like.

If you get nothing else out of a particular chapter or section, remember the material marked by this icon. This text usually contains an essential process or a bit of information you ought to remember.

      

Paragraphs marked with this icon reference the sample files for the book.

      In addition to the book you have in your hands, you can access some extra content online. Check out the free Cheat Sheet for lists of Power Query text functions and Power Query date functions that are good to know. Just go to www.dummies.com and type Excel Power Pivot & Power Query For Dummies Cheat Sheet in the Search box.

      If you want to follow along with the examples in this book, you can download the sample files at www.dummies.com/go/excelpowerpivotpowerqueryfd2e. The files are organized by chapter.

      It’s time to start your self-service BI adventure! If you’re primarily interested in Power Pivot, start with Chapter 1. If you want to dive right into Power Query, jump to Part 2, which begins at Chapter 8.

      Supercharged Reporting with Power Pivot

      Think about data like a relational database.

      Create your own Power Pivot data model.

      Explore the workings of pivot tables.

      Use external data with Power Pivot.

      Manage the Power Pivot internal data model.

      Create your own formulas in Power Pivot.

      Delve deeper into the DAX formula language.

      Thinking Like a Database

      IN THIS CHAPTER

      

Examining traditional Excel limitations

      

Keeping up with database terminology

      

Looking into relationships

      With the introduction of business intelligence (BI) tools such as Power Pivot and Power Query, it’s becoming increasingly important for Excel analysts to understand core database principles. Unlike traditional Excel concepts, where the approach to developing solutions is relatively intuitive, you need to have a basic understanding of database terminology and architecture in order to get the most benefit from Power Pivot and Power Query. This chapter introduces you to a handful of fundamental concepts that you should know before taking on the rest of this book.

      Years of consulting experience have brought this humble author face to face with managers, accountants, and analysts who all have had to accept this simple fact: Their analytical needs had outgrown Excel. They all faced fundamental challenges that stemmed from one or more of Excel’s three problem areas: scalability, transparency of analytical processes, and separation of data and presentation.

      Scalability

      Scalability is the ability of an application to develop flexibly to meet growth and complexity requirements. In the context of this chapter, scalability refers to Excel’s ability to handle ever-increasing volumes of data.

      Imagine that you’re working in a small company and using Excel to analyze its daily transactions. As time goes on, you build a robust process complete with all the formulas, pivot tables, and macros you need in order to analyze the data that is stored in your neatly maintained worksheet.

      As the amount of data grows, you will first notice performance issues. The spreadsheet will become slow to load and then slow to calculate. Why does this happen? It has to do with the way Excel handles memory. When an Excel file is loaded, the entire file is loaded into RAM. Excel does this to allow for quick data processing and access. The drawback to this behavior is that every time the data in your spreadsheet changes, Excel has to reload the entire document into RAM. The net result in a large spreadsheet is that it takes a great deal of RAM to process even the smallest change. Eventually, every action you take in the gigantic worksheet is preceded by an excruciating wait.

      Your pivot tables will require bigger pivot caches, almost doubling the Excel workbook’s file size. Eventually, the workbook will become too big to distribute easily. You may even consider breaking down the workbook into smaller workbooks (possibly one for each region). This causes you to duplicate your work.

      In time, you may eventually reach the 1,048,576-row limit of the worksheet. What happens then? Do you start a new worksheet? How do you analyze two datasets on two different worksheets as one entity? Are your formulas still good? Will you have to write new macros?

      These are all issues that need to be addressed.

      Of course, you will also encounter the Excel power customers, who will find various clever ways to work around these limitations. In the end, though, these methods will always be simply workarounds. Eventually, even these power customers will begin to think less about the most effective way to perform and present analysis of their data and more about how to make data “fit” into Excel without breaking their formulas and functions. Excel is flexible enough that a proficient customer can make most things fit just fine. However, when customers think only in terms of Excel, they’re undoubtedly limiting themselves, albeit in an incredibly functional way.

      A relational database system (such as Access or SQL Server) is a logical next step for the analyst who faces an ever-increasing data pool. Database systems don’t usually have performance implications with large amounts of stored data, and are built to address large volumes of data. An analyst

Скачать книгу