Free Essay

Spanner, Google Data Base

In: Computers and Technology

Submitted By rambhatta
Words 2222
Pages 9
“Spanner: Google’s Globally-Distributed Database”

Spanner is a NewSQL database created by Google. It is a distributed relational database that distributes and stores data, in a Bigtable-like storage layer, across multiple datacenters. Spanner is Google’s scalable, multi-version, globally distributed, and synchronously replicated database. It is the first system to distribute data at global scale and support externally consistent distributed transactions. Spanner provides scalability that enables storing trillions of database rows across millions of nodes distributed over hundreds of datacenters. When you read data, Spanner connects you to the datacenter that is geographically closest; when you write data, it distributes and stores the data across multiple datacenters. If the datacenter you try to access has a failure, you can read the data from another datacenter that holds a replica of the data.

Replication is used for global availability and geographic locality; clients automatically fail over between replicas. Applications can use Spanner for high availability, even in the face of wide-area natural disasters, by replicating their data within or even across continents. Spanner’s main focus is managing cross-datacenter replicated data. Many applications at Google had chosen to use Megastore because of its semi-relational data model and support for synchronous replication, despite its relatively poor write throughput. Applications can specify constraints to control which datacenters contain which data, how far data is from its users, how far replicas are from each other, and how many replicas are maintained.

Spanner has two features that are difficult to implement: externally consistent reads and writes, and globally consistent reads across the database at a timestamp. If a transaction T1 commits before another transaction T2 starts, then T1’s commit timestamp is smaller than T2’s; Spanner is the first system to provide these guarantees at global scale. A Spanner deployment is called a universe. Spanner is organized as a set of zones, where each zone is the rough analog of a deployment of Bigtable. Zones can be added to or removed from a running system as new datacenters are brought into service and old ones are turned off. As a globally distributed database, Spanner provides several interesting features. First, the replication configurations for data can be dynamically controlled at a fine grain by applications. Applications can specify constraints to control which datacenters contain which data, how far data is from its users (to control read latency), how far replicas are from each other (to control write latency), and how many replicas are maintained (to control durability, availability, and read performance). Data can also be dynamically and transparently moved between datacenters by the system to balance resource usage across datacenters.

These features are enabled by the fact that Spanner assigns globally meaningful commit timestamps to transactions, even though transactions may be distributed. The timestamps reflect serialization order. In addition, the serialization order satisfies external consistency (or equivalently, linearizability [20]): if a transaction T1 commits before another transaction T2 starts, then T1’s commit timestamp is smaller than T2’s. Spanner is the first system to provide such guarantees at global scale. The key enabler of these properties is a new TrueTime API and its implementation. The API directly exposes clock uncertainty, and the guarantees on Spanner’s timestamps depend on the bounds that the implementation provides. If the uncertainty is large, Spanner slows down to wait out that uncertainty. Google’s cluster-management software provides an implementation of the TrueTime API. This implementation keeps uncertainty small (generally less than 10ms) by using multiple modern clock references (GPS and atomic clocks).

Replication and distributed transactions are layered onto a Bigtable-based implementation, which can be seen in the spanserver software stack: each spanserver is responsible for between 100 and 1,000 instances of a data structure called a tablet. Spanner assigns timestamps to data, since it keeps multiple versions of each datum. To support replication, each spanserver implements a single Paxos state machine on top of each tablet. The current Spanner implementation logs every Paxos write twice: once in the tablet’s log and once in the Paxos log. Each spanserver also implements a lock table to provide concurrency control.

A directory is also the smallest unit whose geographic replication properties (or placement, for short) can be specified by an application. The design of the placement-specification language separates responsibilities for managing replication configurations. Administrators control two dimensions: the number and types of replicas, and the geographic placement of those replicas. They create a menu of named options in these two dimensions (e.g., North America, replicated 5 ways with 1 witness). An application controls how data is replicated by tagging each database and/or individual directories with a combination of those options. For example, an application might store each end-user’s data in its own directory, which would enable user A’s data to have three replicas in Europe and user B’s data to have five replicas in North America.

The Spanner implementation supports a bucketing abstraction called a directory, which is a set of contiguous keys that share a common prefix. Supporting directories allows applications to control the locality of their data. Directories can be moved even while client operations are in progress; a 50 MB directory can be moved in a few seconds. As noted above, the directory is also the smallest unit whose replication properties can be specified by an application, with administrators controlling two dimensions: the number and types of replicas, and their geographic placement.
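As a rough illustration of the bucketing idea (not Spanner’s actual mechanism, where directories arise from the schema’s interleaved-table hierarchy), keys sharing a common prefix can be grouped into directory-like buckets; the `prefix_len` parameter here is purely hypothetical:

```python
from collections import defaultdict

def bucket_into_directories(keys, prefix_len):
    """Group keys that share a common prefix into 'directories'.

    Illustrative sketch only: prefix_len is a made-up knob; real
    Spanner directories come from the schema, not a fixed prefix.
    """
    directories = defaultdict(list)
    for key in sorted(keys):          # contiguous keys end up together
        directories[key[:prefix_len]].append(key)
    return dict(directories)
```

All keys in one bucket are contiguous in sorted order, which is the property that lets the system move or replicate a directory as a unit.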

Spanner exposes a set of data features to applications. The data model is layered on top of the directory-bucketed key-value mappings supported by the implementation. Each database contains any number of tables, and tables consist of rows, columns, and values. Every table in Spanner’s data model must have a set of primary-key columns, and rows have names derived from those keys. For example:

CREATE TABLE Users {
  uid INT64 NOT NULL,
  email STRING
} PRIMARY KEY (uid), DIRECTORY;

The statement above is an example of a Spanner schema definition.

The TT.now() method returns a TTinterval that is guaranteed to contain the absolute time at which it was invoked. Similarly, TT.after(t) returns true if t has definitely passed, and TT.before(t) returns true if t has definitely not arrived. The TT.after() and TT.before() methods are convenience wrappers around TT.now(). The time references used by TrueTime are GPS and atomic clocks. Atomic clocks can fail, and over long periods of time can drift significantly due to frequency error. TrueTime is implemented by a set of time master machines. The majority of the masters have GPS receivers with dedicated antennas; the remaining masters are equipped with atomic clocks.
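A minimal sketch of this API, assuming a fixed uncertainty bound EPSILON (an assumption for illustration; the real implementation derives a varying bound from time-master polls and worst-case clock drift):

```python
import time

class TTInterval:
    """An interval [earliest, latest] guaranteed to contain absolute time."""
    def __init__(self, earliest, latest):
        self.earliest = earliest
        self.latest = latest

class TrueTime:
    """Sketch of the TrueTime API described above (not Google's code)."""
    EPSILON = 0.007  # ~7 ms uncertainty, illustrative only

    def now(self):
        t = time.time()
        return TTInterval(t - self.EPSILON, t + self.EPSILON)

    def after(self, t):
        # True only if t has definitely passed.
        return t < self.now().earliest

    def before(self, t):
        # True only if t has definitely not arrived.
        return t > self.now().latest
```

Note that for a timestamp inside the current uncertainty window, both after() and before() return false: the API refuses to claim an ordering it cannot guarantee.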

LastTS() as defined above has a similar weakness: if a transaction has just committed, a non-conflicting read-only transaction must still be assigned a read timestamp sread that follows that transaction, so the execution of the read could be delayed. This weakness can be remedied by augmenting LastTS() with a fine-grained mapping from key ranges to commit timestamps in the lock table (this optimization has not yet been implemented). When a read-only transaction arrives, its timestamp can be assigned by taking the maximum value of LastTS() for the key ranges with which the transaction conflicts, unless there is a conflicting prepared transaction.

The Spanner implementation supports read-write transactions and read-only transactions. Standalone writes are implemented as read-write transactions; non-snapshot standalone reads are implemented as read-only transactions. For both read-only transactions and snapshot reads, commit is inevitable once a timestamp has been chosen, unless the data at that timestamp has been garbage-collected. When a server fails, clients can continue the query internally on a different server by repeating the timestamp and the current read position. Transactional reads and writes use two-phase locking. After all locks have been acquired, and before any lock is released, Spanner assigns the transaction the timestamp that Paxos assigns to the Paxos write representing the transaction commit. A single leader assigns timestamps in monotonically increasing order, and whenever a timestamp s is assigned, smax is advanced to s. Each replica tracks a safe time derived from two values, tTM_safe and tPaxos_safe; tPaxos_safe is the simpler of the two, being the timestamp of the highest-applied Paxos write, at or below which no further writes will occur. The replica’s safe time is tsafe = min(tPaxos_safe, tTM_safe).
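The mechanism that makes these commit timestamps externally consistent is commit wait: the leader picks a timestamp no less than TT.now().latest, then delays visibility until that timestamp is guaranteed to be in the past. A sketch under the same fixed-EPSILON assumption as before (real TrueTime exposes a varying bound):

```python
import time

EPSILON = 0.007  # hypothetical TrueTime uncertainty bound, illustrative only

def tt_now():
    """Return (earliest, latest) bounds on absolute time."""
    t = time.time()
    return (t - EPSILON, t + EPSILON)

def commit_wait(apply_writes):
    """Pick commit timestamp s >= TT.now().latest, then wait until s has
    definitely passed before making the commit visible (releasing locks)."""
    s = tt_now()[1]                 # at or above the current 'latest' bound
    while not (s < tt_now()[0]):    # i.e., wait until TT.after(s) is true
        time.sleep(0.001)
    apply_writes()                  # any later transaction now sees s in its past
    return s
```

The wait costs roughly 2·EPSILON of latency, which is why Spanner works hard to keep the uncertainty bound small.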

tPaxos_safe as defined above has a weakness in that it cannot advance in the absence of Paxos writes. That is, a snapshot read at t cannot execute at Paxos groups whose last write happened before t. Spanner addresses this problem by taking advantage of the disjointness of leader-lease intervals. Each Paxos leader advances tPaxos_safe by keeping a threshold above which future writes’ timestamps will occur: it maintains a mapping MinNextTS(n) from Paxos sequence number n to the minimum timestamp that may be assigned to Paxos sequence number n + 1. A replica can then advance tPaxos_safe to MinNextTS(n) − 1 once it has applied through sequence number n.

In a read-write transaction, writes are buffered at the client until commit; as a result, reads in a transaction do not see the effects of the transaction’s own writes. Reads within read-write transactions use wound-wait to avoid deadlocks. When a client has completed all reads and buffered all writes, it begins two-phase commit by sending a commit message to each participant’s leader. Spanner requires a scope expression for every read-only transaction, which is an expression that summarizes the keys that will be read by the entire transaction. A Spanner schema-change transaction is a generally non-blocking variant of a standard transaction; as a result, a schema change across thousands of servers causes minimal disruption to other concurrent activity. Without TrueTime, defining the schema change to happen at a timestamp t would be meaningless. Information can be stored in the lock table, which already maps key ranges to lock metadata, to track fine-grained safe time for key ranges with read conflicts; otherwise the execution of a read could be delayed. When a read-only transaction arrives, its timestamp can be assigned by taking the maximum value of LastTS() for the key ranges it touches, unless there is a conflicting prepared transaction.
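The client-side buffering behavior can be sketched with a toy transaction object, where a plain dict stands in for replicated tablet state (a real commit would go through Paxos and two-phase commit, elided here):

```python
class ReadWriteTxn:
    """Toy sketch of client-buffered writes in a read-write transaction."""
    def __init__(self, store):
        self.store = store      # stands in for replicated tablet state
        self.buffered = {}      # writes held at the client until commit

    def read(self, key):
        # Reads go to the store, not the buffer: a transaction does NOT
        # see the effects of its own uncommitted writes.
        return self.store.get(key)

    def write(self, key, value):
        self.buffered[key] = value

    def commit(self):
        # Real Spanner runs two-phase commit across participant leaders
        # here; this sketch just applies the buffer atomically.
        self.store.update(self.buffered)
        self.buffered.clear()
```

For example, a transaction that writes x and then reads x gets the pre-transaction value back until commit() is called.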

Spanner’s performance is measured with respect to replication, transactions, and availability, along with several benchmarks. Measurements were taken on timeshared machines; each spanserver ran on scheduling units of 4GB of RAM and 4 cores (AMD Barcelona, 2200MHz). The data shows that the two factors determining the base value of the clock uncertainty ε are generally not a problem; however, there can be significant tail-latency issues that cause higher values of ε. A reduction in tail latencies beginning on March 30 was due to networking improvements that reduced transient network-link congestion, and an increase in ε on April 13, about one hour in duration, resulted from the shutdown of two time masters at a datacenter for routine maintenance. Work continues on investigating and removing causes of TrueTime spikes.

Clients and zones were placed in a set of datacenters with a network distance of less than 1ms. Operations were standalone reads and writes of 4KB, measured for both latency and throughput. Single-read read-only transactions execute only at leaders because timestamp assignment must happen at the leader. Read-only-transaction throughput increases with the number of replicas, which were randomly distributed among the zones. Two questions were answered with respect to TrueTime. Machine statistics show that bad CPUs are six times more likely than bad clocks; clock issues are rare relative to much more serious hardware problems. Spanner began being experimentally evaluated in 2011, under Google’s advertising backend, called F1. F1 was originally based on a MySQL database that was manually sharded many ways, and the last resharding took over two years. The F1 team chose Spanner for several reasons: first, Spanner removes the need to manually reshard; second, Spanner provides synchronous replication and automatic failover (with the previous setup, failover was difficult). The F1 team also needed secondary indexes on their data and was able to implement their own consistent global indexes using Spanner transactions.

In conclusion, Spanner combines ideas from two communities: from the database community, a semi-relational interface, transactions, and an SQL-based query language; from the systems community, scalability, fault tolerance, consistent replication, and wide-area distribution. Spanner also focuses on database features that Bigtable was missing. Another aspect of the Spanner design that stands out is that it is possible to build distributed systems with much stronger time semantics, and as clock-uncertainty bounds tighten, the cost of those stronger semantics decreases. As a community, we should no longer depend on loosely synchronized clocks and weak time APIs in designing distributed algorithms.

Work based on Spanner is ongoing. The F1 team is transitioning Google’s advertising backend from MySQL to Spanner, and there is current work to improve the functionality and performance of backups. Given that many applications are expected to replicate data across datacenters, TrueTime ε may noticeably affect performance; time-master query intervals can be reduced, and better clock crystals, or possibly even alternate time-distribution technology, can reduce ε further. Finally, there are areas for improvement: although Spanner is scalable in the number of nodes, algorithms and data structures from the database literature could improve single-node performance, and moving data automatically between datacenters in response to changes in client load would help, but to be effective this would need to be coordinated with moving client application processes between datacenters.

Similar Documents

Premium Essay

Data Base

...understanding of how SQL works. •Lots of examples with SQL using MySQL that make understanding the process of contructing SQL queries easy and also using MySQL and to illustrate the mechanism of storing and retrieving information from databases. (If you find any errors, please send me an email and I'll promptly set it right. If there's something you'd like to see here let me know too.) -------------------------------------------------------------------------------- What's a database ? A database is a collection of data organized in a particular way. Databases can be of many types such as Flat File Databases, Relational Databases, Distributed Databases etc. -------------------------------------------------------------------------------- What's SQL ? In 1971, IBM researchers created a simple non-procedural language called Structured English Query Language. or SEQUEL. This was based on Dr. Edgar F. (Ted) Codd's design of a relational model for data storage where he described a universal programming language for accessing databases. In the late 80's ANSI and ISO (these are two organizations dealing with standards for a wide variety of things) came out with a standardized version called Structured Query Language or SQL. SQL is prounced as 'Sequel'. There have been several versions of SQL and the latest one is SQL-99. Though SQL-92 is the current universally adopted standard. SQL is the language used to query all databases. It's simple to learn and......

Words: 6659 - Pages: 27

Premium Essay

Good Data Base

...1. Poor information is the quickest way to ruin a good database. The case says that data visualization is used for contextualizing the data. Companies will use data visualization to give the date they are showing emotional impact. One place where data visualization is currently being used to is monitor what is being talked about on social media. Digg is a populat website right now delivers the most talked about new stores. You are able to read news stories and shoare them at your computer, on your phone, or tablet. You also get daily emails with popular stories. Stack is a program that creates a data visualization of the stories that are on Digg. Stories drop into a graph and as the topic gets more hits, the bar on the graph grows. Below is an picture of Stack. All Stack is showing is what topics are being looked at, and the amount of times a topic is being accessed. This give the viewer of this data visualization information about what people are interested in currently. This is helpful data that is easy to read. If the devlopers of this program would of put poor inofmation, or unneeded information into this data visualization, it would have become much more confusing, and it would no longer be a beneficial to the viewer. This data visualization is quick and easy to view and come to a conclusion on. Poor data just confuses the person who is using the visualization to gain information. At my last job my bosses loved seeing an overview of the different......

Words: 553 - Pages: 3

Premium Essay

Data Base

...Phoenix Material Determining Databases and Data Communications Read Scenario 1 and Scenario 2 below. Write a paper of no more than 1,500 words in which you respond to the questions designated for both scenarios. Scenario 1: You are a marketing assistant for a consumer electronics company and are in charge of setting up your company’s booth at trade shows. Weeks before a show, you meet with the marketing managers and determine what displays and equipment they want to display. Then, you identify each of the components that need to be shipped and schedule a shipper to deliver them to the trade show site. You then supervise trade show personnel as they set up the booths and equipment. After the show, you supervise packing the booth and all the equipment as well as schedule its shipment back to your home office. When the equipment arrives, you check it into your warehouse to ensure that all pieces of the booth and all the equipment are returned. If there are any problems due to shipping damage or loss, you handle those problems. Your job is important; at a typical show you are responsible for more than a quarter-million dollars of equipment. • In Scenario 1: o You need to track data about booth components, equipment, shippers, and shipment. List typical fields for each type of data. Provide an example of two relationships that you need to track. o Do you need a database system? If not, can Excel® handle the data and the output? What are the......

Words: 436 - Pages: 2

Premium Essay

Data Base Concepts

...implementation of the database. Lookup entity: An entity used to store lookup values such as state names or zip codes. Maximum cardinality: The maximum number of instances of one entity that may be associated with each instance of another entity. Minimum cardinality: The minimum number of instances of one entity that may be associated with each instance of another entity. Naming conventions: Conventions for naming database objects in order to maintain consistency and readability. Physical design: The design of the database within a particular DBMS. The physical design takes account of file systems and disk locations as well as DBMS-specific data types. Surrogate keys: A surrogate key in a database is a unique identifier for either an entity in the modeled world or an object in the database. The surrogate key is not derived from application data. Triggers: A trigger is a collection of SQL commands that are executed when a database event occurs such as an INSERT, UPDATE, DELETE. Weak entities: In a relational database, a weak entity is an entity that cannot be uniquely identified by its attributes alone; therefore, it must use a foreign key in conjunction with its attributes to create a primary key. The foreign key is typically a primary key of an entity it is related to. Chapter 5 Normal forms: a defined standard structure for relational databases in which a relation may not be nested within another relation. Update anomalies: Database normalization is the process......

Words: 1477 - Pages: 6

Premium Essay

Google Data Efficiency

... and why is it an important place to start when considering how to reduce data center power consumption? What value of PUE should data center managers strive for? Power Usage Effectiveness (PUE) is a measure to determine the energy efficiency of a data center. It is the ratio between the energy of the system and the energy of the IT equipment in the data center.               TOTAL FACILITY ENERGY PUE =   ---------------------------------------               IT EQUIPMENT ENERGY It is the ratio of the energy supplied to the entire Data Center and the energy consumed by the IT equipment. Whether to feed the entire plant, we need twice the minimum required to own IT equipment, we have a PUE 2. It allows to calculate the efficiency of the power supply and cooling IT equipment.. A PUE of 2.0 means that for every watt of IT power, an additional watt is consumed to cool and distributes power to the IT equipment. A PUE of 1.0 is considered ideal. A closer PUE of 1.0 means that almost all the energy is used to compute. For use PUE to be effective, should be measured often. If you occasionally makes the information will not be realistic. Ideally, measured as quickly and frequently as every second way. The higher the frequency, the more significant the results. 2. Describe the five methods recommended by Google for reducing power consumption. • Measure Pue To manage the efficiency of a data center must have the tools to measure the efficiency of energy use or......

Words: 324 - Pages: 2

Premium Essay

Data Base

........................................................................................................................... 2 INTRODUCTION ................................................................................................................................... 2 DATABASE MANAGEMENT SYSTEMS .................................................................................................. 2 Database ......................................................................................................................................... 2 Database Management System (DBMS) ......................................................................................... 2 Schemas, Instances and Data Independence.................................................................................. 3 DATA MODELS..................................................................................................................................... 3 Hierarchical Model .......................................................................................................................... 3 Network Model ............................................................................................................................... 4 Relational Model ............................................................................................................................. 5 CHAPTER 2 ..................................................................................................................

Words: 4347 - Pages: 18

Premium Essay

Data Base Mgmnt

...1. How important are accurate data for on-line businesses? Accuracy of data is very important for on-line businesses because if the data is at risk it is difficult to manage or handle .The data should be accurate, complete, easily accessible, consistent and relevant as all these characteristics of data promote revenues for the businesses. For example in the case of Media Tech Direct Works it uses the customer contact information i.e. addresses, phone numbers and e-mail addresses to establish points of contacts with its customers and these links has provide sales opportunities and profits for the company. 2. Is technology sufficient to guarantee that data errors will not occur? If not, then what other factors should a business need to consider? Technology helps the organizations to manage their data but it is not sufficient to guarantee that the data errors will not occur because the data degrades over the period of time. In the case of Media Tech they faced the same problem, the customer data was degrading as it was incomplete or obsolete. The factors that a business need to consider is data –scanning and data –matching, with this system the company would be able to identify redundant, fragmented and incorrect data and clean them from the data warehouse. 3. Provide examples of cost savings that can be achieved by reducing data errors? An example of cost saving that can be achieved by reducing data errors is that a business can save more than $250,000 annually by...

Words: 262 - Pages: 2

Free Essay

Data Base Inthe Wotk Place

...companies have networks composed of servers capable of storing millions of bits of data on their systems. Companies must be able to understand database fundamentals when they decide what database systems they need. When a company obtains data, it must be organized in a readable format so that company personnel can access it from their computers, as it is needed. Databases have become very common in the workplace today. Many organizations use databases to keep track of payroll, vacation time, supply inventories, and other information and for many other tasks too numerous to include in this abstract. Business use database anytime they have large amounts of data they need to search and categorized, so that it can be access later for other uses. In designing and determining the purpose of a database there are two key principles, which need to be considered. The first is duplication of information and the second is the correctness or quality of that information. Duplicate information must be minimized or avoided because it wastes space, increases errors and creates inconsistencies. The second principle, correctness, involves making sure data is entered correctly otherwise any queries and reports generated from the information will also contain incorrect information. Therefore, to ensure their database is good the company must make sure that the data is broken into subject-based tables so that duplicate data can be minimized. They also must make sure that the correct keys......

Words: 821 - Pages: 4

Free Essay

Data Base Esign

...Stephen Favor (CPD121) Assignment Lab 2-3 TR10 1. What is an entity? * An entity is something about which we want to keep data. There are physical entities; something physical in our universe (e.g., a person, place, thing, etc.). There is also a logical entity; something non-physical (e.g., a relationship).Three examples of an entity could be a gun, Airline, and a camera. 2. What is a relationship? * A relationship is a logical connection between records from two or more tables. All relationships can be categorized as one-to-one, one-to-many, or many-to-many. An example of a one-to-many relationship would be a pet store owner and the pets. The Pet Store owner (parent) represents the one, and the pets (children), represent the many. That is given the assumption that the store is owned by a single entity. The Pet Store can have many pets, but the pets can’t belong to many stores. * An example of a many-to-many is brand names-to-cookies. There are many brand names and there are many types of cookies. A brand can make many types of cookies and a cookie type can be made by many different brands. 3. What is an attribute? * An attribute is a property or characteristic of an entity. Three entities and their attributes are (A) gun (make, model), (B) airline (name, model), and (C) A camera (make, type). Part Two 3. What column or columns should be selected as the primary key to best meet the desired properties for a primary key in exhibit 2-17? * I......

Words: 483 - Pages: 2

Premium Essay

Data Base Design

...such as serious embarrassment or inconvenience, identity fraud, lives may even be put at risk. Even if privacy is not in itself a fundamental right, it is necessary to protect other fundamental rights. 3) What role does information security play in this scenario? Information Security refers to the processes and methodologies which are designed and implemented to protect print, electronic, or any other form of confidential, private and sensitive information or data from unauthorized access, use, misuse, disclosure, destruction, modification, or disruption. For the scenario , in the above video I feel that different Information security tools must be used by the company to safeguard the customers personal information. Some of the tools that can be used are: * Campus border firewall :A system designed to prevent unauthorized access to or from a private network. * Encryption. Encryption converts data into a secure form that can be safely used. * Establish proper Guidelines and policies for the use of the data....

Words: 354 - Pages: 2

Free Essay

Data Bases (Spanish

...Fundamentos Base de Datos 1. Aplicaciones de los sistemas de datos: Las bases de datos son muy usadas y las mas significativas son: • Banca • Líneas aéreas • Universidades • Producción • Recursos humanos • Telecomunicaciones • Tarjetas de crédito • Finanzas • Ventas El uso de base de datos en estas áreas es esencial y hoy la mayoría de las empresas tienen base de datos. Desde los inicios del internet, una de las cosas mas importantes es tener bases de datos porqué te reduce trabajo y hace mejor las cosas, como por ejemplo las librerías, las consultas de estados de cuenta en un banco, etc. 2. Sistemas de Base de Datos Frente a Sistemas de Archivos Para poder cambiar y modificar la información, el sistema debe tener ciertas aplicaciones de un sistema operativo convencional que permite crear nuevos archivos o modificarlos, y para este proceso de almacenamiento se necesita mantener la información en un sistema de procesamiento. Los inconvenientes son : • Redundancia e inconsistencia de datos • Dificultad de acceso a los datos • Anomalías en el acceso recurrentes • Aislamiento de datos • Problemas de integridad (restricciones de consistencia) • Problemas de atomicidad (sujetas a fallo) • Problemas de seguridad 3. Visión de los Datos Un sistema de bases de datos es una colección de archivos interrelacionados y un conjunto de programas que permitan a los usuarios......

Words: 1102 - Pages: 5


Free Essay

Data Base

...Metadata

CustTable: central client register table. Statistic and setup values are defined in this table.

Field name | Data type | Field description
AccountNum | NUMBER(5,0) | Unique identifier of a customer.
Name | VARCHAR2(50) | Name of the customer; this name is printed on documents such as invoices and account statements.
InvoiceAccount | NUMBER(5,0) | Invoice account to which the customer is linked; the account to which the invoice amount is debited. If there is no specific invoice account, the customer's own account is used in this field.
CrediMax | NUMBER(8,0) | Maximum amount that the selected customer is allowed to have as an outstanding account balance; always stated in the default currency of the customer.
CountryRegionId | INTEGER | Country/region of the customer address.
Address1 | VARCHAR2(100) | First part of the street or postal address line.
Address2 | VARCHAR2(100) | Second part of the street or postal address line.
City | VARCHAR2(50) | City of the customer address.
ZIpCode | VARCHAR2(5) | Postal/ZIP code of the customer address; often used for searching and sorting.
State | VARCHAR2(2) | State of the customer address; often used for searching and sorting.
Telephone | VARCHAR2(20) | Customer's contact telephone number.
PaymMode_Typical | INTEGER | Stored, typical payment mode of the customer; related to the PaymMode table.
LineOfBusinessID |......

Words: 799 - Pages: 4
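The CustTable schema listed above can be sketched as executable DDL. This is a hypothetical adaptation: the Oracle-style types (NUMBER, VARCHAR2) are mapped to SQLite equivalents so the sketch is runnable, and the sample row is made up.

```python
import sqlite3

# Sketch of the CustTable schema described above, with Oracle-style
# NUMBER/VARCHAR2 types mapped to SQLite's INTEGER/NUMERIC/TEXT.
ddl = """
CREATE TABLE CustTable (
    AccountNum       INTEGER PRIMARY KEY,  -- unique customer identifier
    Name             TEXT NOT NULL,        -- printed on invoices and statements
    InvoiceAccount   INTEGER,              -- account debited for invoices
    CrediMax         NUMERIC,              -- credit limit in default currency
    CountryRegionId  INTEGER,
    Address1         TEXT,
    Address2         TEXT,
    City             TEXT,
    ZIpCode          TEXT,                 -- often used for searching/sorting
    State            TEXT,
    Telephone        TEXT,
    PaymMode_Typical INTEGER               -- relates to a PaymMode table
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.execute(
    "INSERT INTO CustTable (AccountNum, Name, CrediMax) VALUES (?, ?, ?)",
    (10001, "Acme Corp", 5000),  # hypothetical sample customer
)
row = conn.execute(
    "SELECT Name, CrediMax FROM CustTable WHERE AccountNum = 10001"
).fetchone()
print(row)  # ('Acme Corp', 5000)
```

Note that the NUMBER(5,0) on AccountNum implies customer numbers up to five digits; SQLite does not enforce that width, so a CHECK constraint would be needed to replicate it exactly.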

Free Essay

Data Base Management

...1. Data dependence: the data representation is tied to the application program, so a change in how the data is stored also forces a change in the application program. Data independence: the data representation is separated from the application program, so a change in the stored data does not require changing the application program. 2. Structured data: organized data that can be efficiently stored and retrieved in databases and data warehouses; it represents concrete objects and events in the user's environment. Unstructured data: data combining several media types, such as pictures, sounds, and video clips, stored as part of the user's domain. 3. Data: the representation of objects and events that are stored and recorded in the system; it exists in a variety of forms such as numbers, symbols, variables, and so on. For example, a database in a doctor's clinic holds information such as each patient's name, address, diagnosis, symptoms, and phone number. Information: processed data that increases the knowledge of the person using it. Data are meaningless in their raw form, so they are processed and presented to the user as information. 4. Repository: the centralized storage area for data definitions, tables, data relationships, and other parts of the data system. It......

Words: 689 - Pages: 3

Premium Essay

Data Base Assignment

...orders?

(a) 9 customers spent more than $500 but less than $1,000 and have at least 3 distinct orders.

(b) i) Create a table showing each customer's orders:

CREATE TABLE CustomerOrder AS
SELECT CustomerID,
       COUNT(orderID) AS CountOrderID,
       SUM(subtotal)  AS CustomerSubtotal
FROM ByCustomerOrders
GROUP BY CustomerID;

ii) Show customers that spent more than $500 but less than $1,000 and have at least 3 distinct orders:

SELECT *
FROM CustomerOrder
WHERE CountOrderID >= 3
  AND CustomerSubtotal > 500
  AND CustomerSubtotal < 1000;

5. (15 pts) What are our gross revenues per month? Perform this calculation for the eight months contained within the data set (July 1996 - February 1997). If the data set contained multiple years of order detail, what would be the code to calculate average revenues per month (for example, if the data set contained 5 years of order detail, how would you find the average revenues for January, February, March, April, etc. for those 5 years)?

Gross revenue for the 8 months:

1) July
(a) July monthly revenue is 37779.85
(b) SELECT SUM(subtotal) AS JulyMonthlyRevenue
    FROM bycustomerorders AS bcd
    LEFT JOIN orders AS o ON bcd.orderID = o.orderID
    WHERE orderdate < '1996-08' AND orderdate > '1996-07';

2) August
(a) August monthly revenue is 33285.49
(b) SELECT SUM(subtotal) AS AugustMonthlyRevenue
    FROM bycustomerorders AS bcd
    LEFT JOIN orders AS o ON bcd.orderID = o.orderID
    WHERE orderdate < '1996-09' AND orderdate >......

Words: 1102 - Pages: 5
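The excerpt above computes each month's revenue with a separate hand-written date-range query. A sketch of the same calculation that groups by month in one pass follows; the table and column names match the excerpt, but the sample rows are invented, so the totals are illustrative only.

```python
import sqlite3

# One GROUP BY over the month replaces eight hand-written range queries.
# Schema and names follow the excerpt; the data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (orderID INTEGER PRIMARY KEY, orderdate TEXT);
CREATE TABLE bycustomerorders (orderID INTEGER, subtotal REAL);
INSERT INTO orders VALUES (1, '1996-07-05'), (2, '1996-07-20'), (3, '1996-08-02');
INSERT INTO bycustomerorders VALUES (1, 100.0), (2, 250.0), (3, 80.0);
""")

rows = conn.execute("""
    SELECT strftime('%Y-%m', o.orderdate) AS month,
           SUM(b.subtotal)                AS revenue
    FROM bycustomerorders AS b
    JOIN orders AS o ON b.orderID = o.orderID
    GROUP BY month
    ORDER BY month
""").fetchall()
print(rows)  # [('1996-07', 350.0), ('1996-08', 80.0)]
```

For the multi-year average the question asks about, the same idea extends by grouping on `strftime('%m', orderdate)` alone, then averaging the per-year sums for each calendar month.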